Database for football

"How football teams can keep track of their players' statistics"



Database overview


1) Definition:

A database is a collection of information that is organized so that it can be easily accessed, managed and updated. Computer databases typically contain aggregations of data records or files, such as information about sales transactions or interactions with specific customers.

In a relational database, digital information (for example, about a specific customer) is organized into rows, columns and tables, which are indexed to make it easier to find relevant information through SQL queries. In contrast, a graph database uses nodes and edges to define relationships between data entries, and queries require a special semantic search syntax. As of this writing, SPARQL is the only semantic query language that is approved by the World Wide Web Consortium (W3C).

Typically, the database manager provides users with the ability to control read/write access, specify report generation and analyze usage. Some databases offer ACID (atomicity, consistency, isolation and durability) compliance to guarantee that data is consistent and that transactions are complete.
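To make the atomicity part of ACID more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `accounts` table, the names and the amounts are all hypothetical and only serve the illustration: a transfer either applies both updates or none of them.

```python
import sqlite3

# In-memory database; the table, names and amounts are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Move `amount` between two accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; nothing was saved.
        return False


ok = transfer(conn, "alice", "bob", 30)       # enough money: both updates kept
failed = transfer(conn, "alice", "bob", 500)  # too much: both updates undone
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

In the second call the credit to the receiver is executed first, but it is rolled back together with the failed debit, so the stored balances stay consistent.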

Source: https://searchsqlserver.techtarget.com/definition/database

2) Types of databases:

Databases have evolved since their inception in the 1960s, beginning with hierarchical and network databases, through the 1980s with object-oriented databases, and today with SQL and NoSQL databases and cloud databases.

In one view, databases can be classified according to content type: bibliographic, full text, numeric and images. In computing, databases are sometimes classified according to their organizational approach. There are many different kinds of databases, ranging from the most prevalent approach, the relational database, to a distributed database, cloud database, graph database or NoSQL database.

3) Relational database:

A relational database, invented by E.F. Codd at IBM in 1970, is a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways.

Relational databases are made up of a set of tables with data that fits into a predefined category. Each table has at least one data category in a column, and each row has a certain data instance for the categories which are defined in the columns.

(Example of a relational database)

(Source: https://www.ntu.edu.sg/home/ehchua/programming/sql/Relational_Database_Design.html)
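The idea of tables, columns and rows can be sketched in a few lines with Python's built-in sqlite3 module. The `players` table and its column names are my own assumption for the illustration; the three rows reuse the players discussed later in this article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Each column is one data category; the table name and columns are hypothetical.
conn.execute(
    """
    CREATE TABLE players (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        country TEXT NOT NULL
    )
    """
)

# Each row is one data instance for those categories.
conn.executemany(
    "INSERT INTO players (name, country) VALUES (?, ?)",
    [
        ("Cristiano Ronaldo", "Portugal"),
        ("Karim Benzema", "France"),
        ("Gareth Bale", "Wales"),
    ],
)

rows = conn.execute("SELECT id, name, country FROM players ORDER BY id").fetchall()
for row in rows:
    print(row)
```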

The Structured Query Language (SQL) is the standard user and application program interface for a relational database. Relational databases are easy to extend, and a new data category can be added after the original database creation without requiring that you modify all the existing applications.
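As a rough illustration of that extensibility, the sketch below adds a column to an existing table with SQLite's `ALTER TABLE ... ADD COLUMN` and shows that a query written before the change still returns the same result. The table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO players (name) VALUES ('Karim Benzema')")

# A query that an "existing application" wrote before the schema changed.
old_query = "SELECT id, name FROM players"
before = conn.execute(old_query).fetchall()

# Add a new data category (a column) after the table was created.
conn.execute("ALTER TABLE players ADD COLUMN country TEXT")
conn.execute("UPDATE players SET country = 'France' WHERE name = 'Karim Benzema'")

# The old query still works and still returns the same rows.
after = conn.execute(old_query).fetchall()

# List the column names now present in the table.
columns = [info[1] for info in conn.execute("PRAGMA table_info(players)")]
print(before, after, columns)
```

This only holds for applications that name their columns explicitly; code relying on `SELECT *` and a fixed column count could still be affected.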

4) Distributed database:

A distributed database is a database in which portions of the database are stored in multiple physical locations, and in which processing is dispersed or replicated among different points in a network.

Distributed databases can be homogeneous or heterogeneous. All the physical locations in a homogeneous distributed database system have the same underlying hardware and run the same operating systems and database applications. The hardware, operating systems or database applications in a heterogeneous distributed database may be different at each of the locations.

5) Cloud database:

A cloud database is a database that has been optimized or built for a virtualized environment, either in a hybrid cloud, public cloud or private cloud. Cloud databases provide benefits such as the ability to pay for storage capacity and bandwidth on a per-use basis, and they provide scalability on demand, along with high availability.

A cloud database also gives enterprises the opportunity to support business applications in a software-as-a-service deployment.

6) NoSQL database:

NoSQL databases are useful for large sets of distributed data.

(Advantages of a NoSQL database) (Source: https://www.educba.com/what-is-nosql-database/)

NoSQL databases are effective for big data performance issues that relational databases aren't built to solve. They are most effective when an organization must analyze large chunks of unstructured data or data that's stored across multiple virtual servers in the cloud.

7) Graph database:

A graph-oriented database, or graph database, is a type of NoSQL database that uses graph theory to store, map and query relationships. Graph databases are basically collections of nodes and edges, where each node represents an entity, and each edge represents a connection between nodes.

Graph databases are growing in popularity for analyzing interconnections. For example, companies might use a graph database to mine data about customers from social media.

Some graph databases, in particular those built on RDF, employ SPARQL, a declarative query language and protocol for graph database analytics. SPARQL has the capability to perform all the analytics that SQL can perform, plus it can be used for semantic analysis, the examination of relationships. This makes it useful for performing analytics on data sets that have both structured and unstructured data. SPARQL allows users to perform analytics on information stored in a relational database, as well as on friend-of-a-friend (FOAF) relationships, PageRank and shortest path.
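The node-and-edge model can be sketched without any graph database at all. The example below is plain Python, not SPARQL: it stores a small, made-up social graph as an adjacency dictionary and answers two of the questions mentioned above, friend-of-a-friend and shortest path (here with a breadth-first search). All names are hypothetical.

```python
from collections import deque

# Hypothetical social graph: each key is a node, each listed name is an edge.
graph = {
    "ana": ["ben", "carl"],
    "ben": ["ana", "dina"],
    "carl": ["ana", "dina"],
    "dina": ["ben", "carl", "eli"],
    "eli": ["dina"],
}


def friends_of_friends(graph, person):
    """People exactly two edges away who are not already direct friends."""
    direct = set(graph[person])
    result = set()
    for friend in direct:
        result.update(graph[friend])
    return result - direct - {person}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


print(friends_of_friends(graph, "ana"))
print(shortest_path(graph, "ana", "eli"))
```

A real graph database would run this kind of traversal inside the engine and over far larger data; the sketch only shows the shape of the questions.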

Source: https://searchsqlserver.techtarget.com/definition/database


Main advantages


1. Improved data sharing

One advantage of the database management approach is that the DBMS helps to create an environment in which end users have better access to more and better-managed data.

Such access makes it possible for end users to respond quickly to changes in their environment.

2. Improved data security

The more users access the data, the greater the risks of data security breaches. Corporations invest considerable amounts of time, effort, and money to ensure that corporate data are used properly.

A DBMS provides a framework for better enforcement of data privacy and security policies.

3. Better data integration

Wider access to well-managed data promotes an integrated view of the organization’s operations and a clearer view of the big picture. It becomes much easier to see how actions in one segment of the company affect other segments.

4. Improved data access

The DBMS makes it possible to produce quick answers to ad hoc queries. From a database perspective, a query is a specific request issued to the DBMS for data manipulation—for example, to read or update the data. Simply put, a query is a question, and an ad hoc query is a spur-of-the-moment question. The DBMS sends back an answer (called the query result set) to the application. For example, end users, when dealing with large amounts of sales data, might want quick answers to questions (ad hoc queries) such as:

- What was the dollar volume of sales by product during the past six months?

- What is the sales bonus figure for each of our salespeople during the past three months?

- How many of our customers have credit balances of 3,000 or more?
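The last question above could be answered with a single SQL query. Here is a minimal sketch with Python's built-in sqlite3 module; the `customers` table, its columns and the four rows are made-up sample data for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and sample rows, not real data.
conn.execute("CREATE TABLE customers (name TEXT, credit_balance REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [
        ("Customer A", 4500.0),
        ("Customer B", 1200.0),
        ("Customer C", 3000.0),
        ("Customer D", 250.0),
    ],
)


def customers_with_balance_at_least(conn, threshold):
    """Ad hoc query: how many customers have a balance of `threshold` or more?"""
    row = conn.execute(
        "SELECT COUNT(*) FROM customers WHERE credit_balance >= ?",
        (threshold,),
    ).fetchone()
    return row[0]


count = customers_with_balance_at_least(conn, 3000)
print(count)
```

Because the threshold is passed as a parameter, the same function answers the question for any other amount without changing the query text.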

5. Improved decision making

Better-managed data and improved data access make it possible to generate better-quality information, on which better decisions are based. The quality of the information generated depends on the quality of the underlying data.

Data quality is a comprehensive approach to promoting the accuracy, validity, and timeliness of the data. While the DBMS does not guarantee data quality, it provides a framework to facilitate data quality initiatives.

6. Increased end-user productivity

The availability of data, combined with the tools that transform data into usable information, empowers end users to make quick, informed decisions that can make the difference between success and failure in the global economy.

So far we have seen the different benefits of database management systems, but they also have certain limitations and disadvantages.

Source: http://www.myreadingroom.co.in/notes-and-studymaterial/65-dbms/462-advantages-and-disadvantages-of-dbms.html


Real Madrid's attack statistics for season 2013/2014 using the Firebase Realtime Database


In the case of football, where each team has around 25 players, a good way to record their season's performance and statistics is to use a real-time database. In fact, keeping track of each player's number of goals, assists, minutes played in each competition, number of injuries, chances created, etc. may be very difficult without the help of a database that is linked to various game-recording algorithms.

Firebase could be a great tool to record football players' statistics. For instance, I decided to try it on Real Madrid (my unconditional favourite team) and its attack for the season 2013/2014, in which the club won its 10th Champions League.

Madrid's attack for this season was composed of three world-class players: Cristiano Ronaldo (born February 5th, 1985, Portugal), Karim Benzema (born December 19th, 1987, France) and Gareth Bale (born July 16th, 1989, Wales), who were crucial to the title win thanks to their exceptional performances that season.

My idea was to design a database that clearly shows the performances of the three players in terms of efficiency on the pitch. Normally, players' performances are shown on boards that list many types of player attributes, which are hard to understand because they lack clarity. Using the Firebase database will improve data sharing, access and understanding of the trio's performance.

Using Firebase, I was able to lay out the performance of the three players in the three main competitions played by the club each year: the UEFA Champions League (the European club cup), La Liga (the Spanish championship) and the Copa del Rey (the Spanish cup). That way, we have all the information on the three players stored in a database: https://hicham-s-database.firebaseio.com/

Also, if you want to use the information, you can export the JSON document by copying this code:

That way, if you fancy becoming a football analyst, here's a good start!
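As one possible next step, here is a minimal sketch of how an exported JSON document could be loaded and summarised in Python. It assumes a nested layout of player, then competition, then statistic; that layout, the key names and every number below are placeholders I made up for the example, not the real schema or the real statistics of the three players.

```python
import json

# Hypothetical export: the layout and the numbers are placeholders,
# NOT the actual database content or real player statistics.
exported = """
{
  "players": {
    "player_a": {
      "la_liga": {"goals": 3, "assists": 1},
      "champions_league": {"goals": 2, "assists": 0}
    },
    "player_b": {
      "la_liga": {"goals": 1, "assists": 4},
      "copa_del_rey": {"goals": 0, "assists": 2}
    }
  }
}
"""


def totals_per_player(document):
    """Sum goals and assists over all competitions for each player."""
    totals = {}
    for player, competitions in document["players"].items():
        totals[player] = {
            "goals": sum(c["goals"] for c in competitions.values()),
            "assists": sum(c["assists"] for c in competitions.values()),
        }
    return totals


data = json.loads(exported)
totals = totals_per_player(data)
for player, stats in totals.items():
    print(player, stats)
```

With a real export, only the `exported` string would need to change, provided the document follows the same nesting; a different layout would need a different loop.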