A Short Story on the
Long Road to Brexit

Historical Background

Relations between the UK and the rest of Europe have always been a complicated affair. Before getting into the heart of the matter, below is a summary of the major points of the UK-EU relationship:


In the last few years, Brexit has been one of the most discussed topics around the world. Between 2015 and Britain's official exit from the EU at the end of January 2020, the subject generated countless discussions and debates. Opponents speculated that Brexit would have a negative impact on certain sectors, while supporters claimed the opposite. People from different backgrounds weighed in on the matter, and their views and opinions diverged and clashed. This made it hard to determine which side held the majority and had the edge in the debate, until the EU referendum declared, in numbers, those in favor of Brexit the winning side.

The problem with the referendum is that it is anonymous and therefore does not tell us much about the attributes of the people who voted. On the other hand, analyzing data sets that reveal the identity of individuals, such as quotations or non-anonymous surveys, makes it possible to obtain such information. In contrast to surveys or polls, quotations span a period of time and therefore make it possible to track how opinions about Brexit evolved. In fact, the debates around Brexit did not end with the referendum: they had only just begun. And therefore we ask ourselves: "How did the perception of Brexit evolve after the referendum?"

The study aims at analyzing the evolution of the perception of Brexit throughout the years (2016 - 2020) using a quantitative approach that aggregates the views about Brexit by sector, country, age, gender and profession of speakers.

The main research question was broken down into several subquestions that will be answered throughout the analysis:

About the Data

The project data comes from 3 different sources:


  • Quotebank: 178 million unique speaker-attributed quotations that were extracted and attributed using Quobert, a language-agnostic framework for quotation attribution.
  • Wikidata: a free and open knowledge base that acts as central storage for the structured data of its Wikimedia sister projects (including Wikipedia, Wiktionary, etc.).
  • FTSE 100: a share index of the 100 companies listed on the London Stock Exchange with (in principle) the highest market capitalisation.

The number of quotations varies significantly with the years:

Let's have a look at the graph and analyse some interesting findings:

  • The peak in June-July 2016 coincides with the date of the referendum that took place on the 23rd of June 2016.
  • The peak on the 9th of June 2017 corresponds to the date when Theresa May lost her overall majority in the UK Parliament.
  • The peak on the 13th of December 2019 corresponds to the date when Boris Johnson, the UK's PM, won a large majority to "get Brexit done".

It is interesting to point out that the evolution of the quotations in time is very much in line with the news and updates around Brexit.
Since Quotebank provides quotations only until 2020, our analysis stops at that year.

The data in brief:







Let the data speak

The British people have spoken!

First things first! Brexit has a direct and crucial impact on the lives of the British. It is therefore natural to start by analysing how the perception of Brexit evolved among the British, especially after the turmoil of the referendum.

The pie charts show that the perception of Brexit became less positive and more negative over the years, except in 2020. The drop in popularity in the first years following the referendum might be explained by different reasons:

The opposite trend in 2020 might be explained by the impact of Covid hiding or masking the impact of Brexit. We speculate that, for instance, some of the problems related to food supply were blamed on the pandemic rather than on the consequences of Brexit. In general, the perception of Brexit is more positive than negative, which is in line with the results of the referendum. So, no major switches in opinion were observed.

Using a statistical tool, Welch's t-test, the difference between the years 2018 and 2016 was found to be significant (p-value of 1.44e-7 with a chosen significance level of 0.05). In contrast, the difference between 2016 and 2020 is not significant, with a p-value of 0.615. These results are consistent with our previous speculations.
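As a rough illustration of the test used above, here is a minimal sketch of Welch's t-test in plain Python. It assumes that each year is represented by a list of per-quotation sentiment scores, which is a hypothetical representation, and the function name `welch_t_test` is ours. The p-value below relies on a normal approximation that is only reasonable for large samples; `scipy.stats.ttest_ind(a, b, equal_var=False)` derives it from the t-distribution instead.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t_test(a, b):
    """Welch's t-test for two independent samples with unequal variances.

    Returns (t, df, approx_p). The p-value is a two-sided normal
    approximation, an assumption that only holds well for large samples.
    """
    na, nb = len(a), len(b)
    # Squared standard errors of the two sample means
    # (statistics.variance is the unbiased sample variance).
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    approx_p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, approx_p


# Hypothetical sentiment scores for two years (made-up numbers).
scores_2016 = [0.4, 0.1, -0.2, 0.3, 0.5, 0.2]
scores_2018 = [-0.1, 0.0, -0.3, 0.1, -0.2, 0.2]
t, df, p = welch_t_test(scores_2016, scores_2018)
print(f"t = {t:.3f}, df = {df:.1f}, approximate p = {p:.3f}")
```

A difference would be called significant when the p-value falls below the chosen level of 0.05.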

See EU later!

Brexit does not only involve the UK: it also impacts the UK's relations with other countries, especially EU countries. Brexit has led to the creation of new laws that regulate trade and other aspects of the relationship between the UK and the rest of the world. In a nutshell, matters will eventually become more complicated not only for the UK but also for EU countries. Therefore, these countries did not hesitate to give their opinion on Brexit as well.

It is important to point out that only countries for which enough data is available were kept for the analysis. The aim is not to analyse the results for all countries but rather to look at the neighbours, which are the countries directly affected by the exit decision. In 2016, the perception of Brexit in the European countries was rather negative. This might be explained by the shock of the referendum, after which some countries perceived Brexit as the end of Europe. However, in the following year, the perception began to improve across Europe's largest economies, possibly due to the economic benefits of the relocation of British companies to other European countries. For example, in the case of France, financial firms moved around €150 billion from Britain to France. In 2020, the views towards Brexit were relatively neutral, except for Germany, which stood its ground. Italy went from being very negative about Brexit in 2016 to being positive about it in 2020. It is the country that switched its opinion the most. This switch might be explained by the rise to power of the far-right party "Lega Nord".

Even though the United States is not a direct neighbour of the UK, its opinion is also interesting, as it has been in favour of Brexit all along. This is not surprising, as the US arguably benefits from the instability of Europe and from the potential new trade agreements that would result from a new privileged relationship with the UK. It is also interesting to look at Russia, which has been negative towards Brexit all along. This is a surprising result, since we expected Russia to be neutral towards Brexit.

However, these hypotheses should be treated with care, since:

Wait, let's hear some professional opinions!

Now that we have an overview, let's get to the heart of the matter with an analysis of the perception of Brexit by sector. Indeed, an analysis at the country level is not necessarily very instructive given the heterogeneity of opinions within the same country. It is impossible to capture all opinions, but we believe that grouping them by sector makes the results more interpretable. Feel free to add or remove sectors as you wish using the filter option!

A cross appears in the heat map because the health row and column have a very low p-value against all other sectors, which means that the distribution of sentiment in the health sector is statistically different from the distributions of the other sectors. Looking at the pie chart, we see that sentiment in the health sector is generally much more negative than in the other sectors. This might be explained by a number of reasons, including likely shortages of certain medicines due to supply difficulties.

On the other hand, the economy, a sector that took a serious hit from Brexit, does not show significant differences from other sectors such as academia and tech. This makes sense because these sectors also suffered from Brexit. When looking at the evolution of the perception of Brexit across sectors, it is interesting to notice that some important sectors that are positive about Brexit, such as politics, economy, art and academia, became less and less positive about the exit decision over the years. Overall, health is the sector with the most negative perception of Brexit.
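The pairwise comparison behind such a heat map could be sketched as follows. This is only an illustration under assumptions: `groups` maps a sector name to a hypothetical list of sentiment scores, `test` is any function returning a p-value for two samples (for instance a Welch's t-test), and the helper names are ours.

```python
from itertools import combinations


def pairwise_p_values(groups, test):
    """Build a symmetric matrix of p-values for every pair of groups.

    groups: dict mapping a group name to a list of scores.
    test:   callable (sample_a, sample_b) -> p-value.
    Returns (names, matrix) where matrix[i][j] compares names[i] and names[j].
    """
    names = sorted(groups)
    index = {name: i for i, name in enumerate(names)}
    # The diagonal compares a group with itself, so it is set to 1.0.
    matrix = [[1.0] * len(names) for _ in names]
    for x, y in combinations(names, 2):
        p = test(groups[x], groups[y])
        matrix[index[x]][index[y]] = p
        matrix[index[y]][index[x]] = p
    return names, matrix


def distinct_from_all(names, matrix, alpha=0.05):
    """Names of groups that differ significantly from every other group,
    i.e. the rows and columns that would show up as a 'cross'."""
    return [
        name
        for i, name in enumerate(names)
        if all(matrix[i][j] < alpha for j in range(len(names)) if j != i)
    ]
```

With a real test plugged in, `distinct_from_all` would return the sectors whose sentiment distribution stands apart from all the others, which is the role the health sector plays in the heat map.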

What about demographics?

Besides the profession and nationality of a person, it is interesting to investigate whether differences in attributes such as age and gender lead to differences in opinions about Brexit.
Are older people more in favor of Brexit than younger people, as commonly believed?

Two analyses were performed to compare age categories: the first considering all speakers, irrespective of their nationality, and the second considering only speakers from the UK. Additionally, we performed the same analysis as for the sectors, i.e. we checked whether there is a significant difference between each pair of categories (heat maps). At first glance at the histograms, we were tempted to think that the perception of Brexit does not depend on age. The p-values obtained for the entire world are consistent with this: they are high (greater than 0.05) except for the 90-100 age category. However, little data is available for this age category, so it is not truly representative.
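The grouping into ten-year age categories, together with a check for categories that hold too little data, might look like the sketch below. The record format (age, sentiment score), the helper names and the `min_size` threshold of 30 are all assumptions made for illustration.

```python
from collections import defaultdict


def age_category(age):
    """Map an age to its ten-year category label, e.g. 25 -> '20-30'."""
    low = (int(age) // 10) * 10
    return f"{low}-{low + 10}"


def group_by_age(records, min_size=30):
    """Group sentiment scores by age category.

    records:  iterable of (age, score) pairs (hypothetical format).
    min_size: assumed threshold below which a category is flagged as sparse.
    Returns (groups, sparse) where sparse lists the under-populated categories.
    """
    groups = defaultdict(list)
    for age, score in records:
        groups[age_category(age)].append(score)
    sparse = sorted(c for c, scores in groups.items() if len(scores) < min_size)
    return dict(groups), sparse
```

Categories returned in `sparse` would deserve the same caution as the 90-100 category discussed above.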

In contrast, the UK analysis shows some differences across age categories. For instance, the 20-30 age category is significantly different from the 70-80 age category. However, young people seem to be more supportive of Brexit than older people, which strongly contradicts our expectations. This might be explained by the following reasons:

Is there a difference in perception of Brexit by people of different genders?

From the graph and a p-value of 0.105 (Welch's t-test), we find no significant difference between the gender categories at the 0.05 level. In other words, the data give no evidence that people of different genders perceive Brexit differently.

Let's talk money!

For our last analysis, we decided to focus on the most discussed aspect of Brexit, i.e. the economic aspect. We thought it would be interesting to dig deeper and check whether Brexit really has a negative impact on the economy, or whether this is only the opinion of economists and is not consistent with reality. To do so, we looked at the 100 largest companies on the London Stock Exchange, also known as the FTSE 100, and tried to analyse whether there was a correlation between the emergence of new events related to Brexit and movements in their stock prices.

For each stock, the absolute value of the derivative of its price (in other words, the magnitude of its slope) was computed, along with the derivative of the number of quotations, both with respect to time. The Pearson correlation between the two was then computed, and only the stocks for which the absolute value of the correlation was greater than 0.15 were kept for the analysis. A negative correlation means that when the number of quotations about Brexit increases, the value of the derivative decreases, and vice versa. We read a negative correlation as a sign of the negative impact of Brexit: it makes investors reluctant and leads them to prefer selling their shares. This is the case for all the stocks except POLYMETAL INTERNATIONAL PLC. Note that the Pearson correlation does not exceed 0.3 in absolute value for any stock, which is still a weak correlation. It is also important to point out that correlation is not causation! The disturbance in the stock market might be explained by other factors or confounders that have not been explored in the course of the study.
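The computation described above can be sketched in plain Python as follows. It is a simplified reading of the text, not the exact pipeline: the derivative is approximated by a first-order difference, both series are assumed to be sampled on the same dates, and the function names and input format are hypothetical.

```python
import math


def diff(series):
    """First-order difference, a discrete stand-in for the time derivative."""
    return [b - a for a, b in zip(series, series[1:])]


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # correlation is undefined for a constant series
    return cov / (sx * sy)


def slope_correlation(prices, quote_counts):
    """Correlate |d(price)/dt| with d(number of quotations)/dt."""
    price_slope = [abs(d) for d in diff(prices)]
    quote_slope = diff(quote_counts)
    return pearson(price_slope, quote_slope)


def select_stocks(stocks, quote_counts, threshold=0.15):
    """Keep the stocks whose correlation exceeds the threshold in absolute value.

    stocks: dict mapping a stock name to its list of prices.
    """
    kept = {}
    for name, prices in stocks.items():
        r = slope_correlation(prices, quote_counts)
        if abs(r) > threshold:
            kept[name] = r
    return kept
```

In practice, `scipy.stats.pearsonr` would do the same job as the hand-written `pearson` helper and also return a p-value for the correlation.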

Let's take a step back...

There are many reasons why the analysis may not be fully valid. Some of the limitations are listed below:

  • Limitations related to the dataset: Quotes from Quotebank about Brexit are only a subset of what was said about the topic. Having more quotations might change the results and the analysis. In addition, speaker attribution might not always be accurate. This is problematic because a lot of information depends on the speaker, so a faulty attribution has a direct impact on the interpretation of the data. Furthermore, the data is imbalanced across years, countries, age and gender. Finally, it is important to point out that the data comes only from news articles, which puts the spotlight on people in the public eye at the expense of groups that are not necessarily exposed to the media.
  • Limitations related to the data processing: A quotation labeled negative by the sentiment analysis does not necessarily imply that the speaker is against Brexit.

To cut a long story short

The analysis was sometimes in line with our expectations and with the events related to Brexit. When that was not the case, we tried to speculate about the reasons why the observed outcome was surprising. Even though our story is coming to an end, Brexit continues to impact the lives of the British and, more broadly, the lives of all Europeans. Through our data story, we hope to have provided more insight into what has been happening since the referendum on the 23rd of June 2016, and above all to have provided a more global view of the situation by analysing the data over several years.

About the Team


Gaelle Abi Younes

MSc Civil Engineering


Raffaele Ancarola

MSc Computational Science


Arnaud Guibbert

MSc Microengineering


Jean Naftalski

MSc Civil Engineering