Bullshitters: Who Are They and What Do We Know about Their Lives?


DISCUSSION PAPER SERIES
IZA DP No. 12282
Bullshitters. Who Are They and What Do We Know about Their Lives?
John Jerrim, Phil Parker, Nikki Shure
APRIL 2019

John Jerrim (UCL)
Phil Parker (Australian Catholic University)
Nikki Shure (UCL and IZA)

Any opinions expressed in this paper are those of the author(s) and not those of IZA. Research published in this series may include views on policy, but IZA takes no institutional policy positions. The IZA research network is committed to the IZA Guiding Principles of Research Integrity.

The IZA Institute of Labor Economics is an independent economic research institute that conducts research in labor economics and offers evidence-based policy advice on labor market issues. Supported by the Deutsche Post Foundation, IZA runs the world’s largest network of economists, whose research aims to provide answers to the global labor market challenges of our time. Our key objective is to build bridges between academic research, policymakers and society.

IZA Discussion Papers often represent preliminary work and are circulated to encourage discussion. Citation of such a paper should account for its provisional character. A revised version may be available directly from the author.

ISSN: 2365-9793
IZA – Institute of Labor Economics
Schaumburg-Lippe-Straße 5–9, 53113 Bonn, Germany
Phone: +49-228-3894-0
Email: [email protected]
www.iza.org

ABSTRACT

‘Bullshitters’ are individuals who claim knowledge or expertise in an area where they actually have little experience or skill. Despite this being a well-known and widespread social phenomenon, relatively few large-scale empirical studies have been conducted into this issue. This paper attempts to fill this gap in the literature by examining teenagers’ propensity to claim expertise in three mathematics constructs that do not really exist. Using Programme for International Student Assessment (PISA) data from nine Anglophone countries and over 40,000 young people, we find substantial differences in young people’s tendency to bullshit across countries, genders and socio-economic groups. Bullshitters are also found to exhibit high levels of overconfidence and believe they work hard, persevere at tasks, and are popular amongst their peers. Together this provides important new insight into who bullshitters are and the type of survey responses that they provide.

JEL Classification: I24, J16
Keywords: PISA, overclaiming, bullshit

Corresponding author:
Nikki Shure
Department of Social Science
UCL Institute of Education
University College London
20 Bedford Way
London, WC1H 0AL
United Kingdom
E-mail: [email protected]

1. Introduction

In his seminal essay-turned-book On Bullshit, Frankfurt (2005) defines and discusses the seemingly omnipresent cultural phenomenon of bullshit. He begins by stating that “One of the most salient features of our culture is that there is so much bullshit. Everyone knows this. Each of us contributes his share” (Frankfurt, 2005: 1). His book spent weeks on the New York Times’ bestsellers list in 2005 and has recently been cited in the post-truth age to better understand Donald Trump (e.g. Jeffries, 2017; Heer, 2018; Yglesias, 2018). Other philosophers have since expanded on his work, most notably G. A. Cohen in his essay “Deeper into Bullshit” (Cohen 2002), but there has been limited large-scale empirical research into this issue. We fill this important gap in the literature by providing new cross-national evidence on who is more likely to bullshit and how these individuals view their abilities and social status. This is an important first step in better understanding the seemingly ubiquitous phenomenon of bullshit.

We make use of an influential cross-national education survey administered every three years by the Organisation for Economic Cooperation and Development (OECD), namely the Programme for International Student Assessment (PISA). This data is commonly used by the OECD and education researchers to benchmark education systems or the performance of specific subgroups of pupils (e.g. Anderson et al., 2007; Jerrim and Choi, 2014; Parker et al., 2018), but has never been used to compare participants across countries in terms of their proclivity to bullshit. This paper fills this important gap in the literature.

Previous academic work on bullshit has been limited and mostly theoretical.
Black (1983) edited a collection of essays on “humbug”, the predecessor of bullshit, which he defines as “deceptive misrepresentation, short of lying, especially by pretentious word or deed, of somebody's own thoughts, feelings or attitudes” (Black, 1983: 23). Frankfurt (2005) is the first theoretical treatment of the concept of “bullshit” and he situates it in terms of previous philosophical traditions. A crucial aspect of bullshitting in Frankfurt’s work is the fact that bullshitters have no concern for the truth, which is different from a purposeful lie (Frankfurt, 2005: 54). Cohen responds to Frankfurt’s essay and focuses on a slightly different definition of bullshit where “the character of the process that produces bullshit is immaterial” (Cohen, 2002: 2).

Petrocelli (2018) is one of the few studies to explore bullshitting empirically. He looks at the “antecedents of bullshit”, namely: topic knowledge, the obligation to provide an opinion hypothesis (i.e. individuals are more likely to bullshit when they feel social pressure to provide a response) and the “ease of passing bullshit hypothesis” (i.e. people are more willing to bullshit when they believe they will get away with it). He finds that participants are more likely to bullshit when there is pressure to provide an opinion, irrespective of their actual level of knowledge. Petrocelli also concludes that individuals are more likely to bullshit when they believe they can get away with it, and less likely to bullshit when they know they will be held accountable for the responses they provide (Petrocelli, 2018). His work uses smaller sample sizes than our work (N ≈ 500) and does not answer the question of who bullshitters are and how they view their abilities or social standing.

Pennycook et al. (2015) is the only other empirical study focused on bullshit. They present experiment participants with “pseudo-profound bullshit” (vacuous statements constructed out of buzzwords) to ascertain when they can differentiate bullshit from meaningful statements, and create a Bullshit Receptivity (BSR) scale. Their results point to the idea that some people may be more receptive towards pseudo-profound bullshit, especially if they have a more intuitive cognitive style or believe in the supernatural (Pennycook et al., 2015). Their study focuses on the ability to detect bullshit and the mechanisms behind why some people cannot detect bullshit, rather than proclivity to bullshit, which is the focus of this paper.

In psychology, there has been a related literature on overconfidence and overclaiming. More and Healy (2008) provide a thorough overview of existing studies on overconfidence and distinguish between “overestimation”, “overplacement”, and “overprecision” as three distinct types of overconfidence.
Overestimation occurs when individuals rate their ability as higher than it is actually observed to be, overplacement occurs when individuals rate themselves relatively higher than their actual position in a distribution, and overprecision occurs when individuals assign narrow confidence intervals to an incorrect answer, indicating overconfidence in their ability to answer questions correctly (More and Healy, 2008). The questions we use to construct our bullshit scale are closely related to overestimation and overprecision, since the individuals need to not only identify whether or not they are familiar with a mathematical concept, but also assess their degree of familiarity.

Similar to how we define bullshit, overclaiming occurs when individuals assert that they have knowledge of a concept that does not exist. In one of the first studies on overclaiming, Philips and Clancy (1972) create an index of overclaiming based on how often individuals report consuming a series of new books, television programmes, and movies, all of which were not real products. They use this index to explore the role of social desirability in survey responses. Stanovich and Cunningham (1992) also construct a scale of overclaiming using foils, fake concepts mixed into a list of real concepts, and signal-detection logic for authors and magazines to examine author familiarity. In both of these studies, however, the focus is not on the actual overclaiming index. Randall and Fernandes (1991) also construct an overclaiming index, but use it as a control variable in their analysis of self-reported ethical conduct.

Paulhus, Harms, Bruce, and Lysy (2003) focus more directly on overclaiming. They construct an overclaiming index using a set of items, of which one-fifth are non-existent, and employ a signal-detection formula to measure overclaiming and actual knowledge. They find that overclaiming is an operationalisation of self-enhancement and that narcissists are more likely to overclaim than non-narcissists (Paulhus et al., 2003). Atir, Rosenzweig, and Dunning (2015) find that people who perceive their expertise in various domains favourably are more likely to overclaim. Pennycook and Rand (2018) find that overclaimers perceive fake news to be more accurate. Similar to Atir et al. (2015), we find that young people who score higher on our bullshit index also have high levels of confidence in their mathematics self-efficacy and problem-solving skills.

We contribute to the existing literature on the related issues of bullshitting, overconfidence and overclaiming in three important ways. First, we use a large sample of 40,550 young people from nine Anglophone countries to examine bullshit, which enables us to dig deeper into the differences between subgroups (e.g. boys versus girls, advantaged versus disadvantaged young people). Second, we provide the first internationally comparable evidence on bullshitting.
We use confirmatory factor analysis to construct our scale and test for three hierarchical levels of measurement invariance (configural, metric and scalar). This allows us to compare average scores on our bullshit scale across countries in a robust and meaningful way. Finally, we also examine the relationship between bullshitting and various other psychological traits, including overconfidence, self-perceptions of popularity amongst peers and reported levels of perseverance. Unlike many previous studies, we are able to investigate differences between bullshitters and non-bullshitters conditional upon a range of potential confounding characteristics (including a high-quality measure of educational achievement), providing stronger evidence that bullshitting really is independently related to these important psychological traits.

Our findings support the view that young men are, on average, bigger bullshitters than young women, and that socio-economically advantaged teenagers are more likely to be bullshitters than their disadvantaged peers. There is also important cross-national variation, with young people in North America more likely to make exaggerated claims about their knowledge and abilities than those from Europe. Finally, we illustrate how bullshitters display overconfidence in their skills, and are more likely than other young people to report that they work hard when challenged and are popular at school.

The paper now proceeds as follows. Section 2 provides an overview of the Programme for International Student Assessment (PISA) 2012 data and our empirical methodology. This is accompanied by Appendix A, where we discuss how we test for measurement invariance of the latent bullshit scale across groups. Results are then presented in section 3, with discussion and conclusions following in section 4.

2. Data

Throughout this paper we focus upon Programme for International Student Assessment (PISA) data collected in 2012. Although around 70 countries participated, we focus upon nine countries where English is the most commonly spoken language in order to minimise concerns about translation of survey items and hence comparability.[1] A multi-stage survey design was used, with schools first divided into a series of strata and then randomly sampled with probability proportional to size. From within each school, a sample of around 30 15-year-olds was randomly selected to participate. A total of 2,689 schools and 62,969 pupils took part in the study from across our nine Anglophone countries, reflecting official response rates of around 80 percent. In all nine countries, the sample was therefore fully compliant with the strict standards set by the OECD. Final student senate weights are applied throughout our analysis, with each country being given equal weight. Likewise, standard errors are clustered at the school level in order to take the complex PISA survey design into account.

[1] Our identification of bullshitters relies upon participants’ responses to some ‘fake’ questions, as shall be discussed below. We are concerned about how well these fake constructs translate to languages outside of English, and hence focus upon the nine Anglophone countries included within the sample.

PISA is primarily designed to measure the mathematics, science and reading skills of 15-year-olds across countries via a two-hour achievement test. However, participants also complete a 30-minute background questionnaire that gathers information on young people’s demographic background and their knowledge, attitudes and experience of subjects they study at school. Mathematics was the focus of PISA 2012, with most test and questionnaire items centred around this subject.
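The idea of giving each country equal weight when data are pooled can be illustrated with a minimal sketch. It assumes that "senate" weighting amounts to rescaling student weights so that they sum to the same constant within every country; the function name, the toy data and the target total of 1,000 are illustrative assumptions, not the OECD's actual procedure.

```python
from collections import defaultdict


def senate_weights(records, target_total=1000.0):
    """Rescale (country, weight) pairs so each country's weights sum to target_total.

    target_total is an arbitrary illustrative constant (an assumption).
    """
    totals = defaultdict(float)
    for country, weight in records:
        totals[country] += weight
    return [
        (country, weight * target_total / totals[country])
        for country, weight in records
    ]


# Hypothetical toy data: country "A" has far larger raw weights than "B".
students = [("A", 120.0), ("A", 80.0), ("B", 5.0), ("B", 15.0), ("B", 30.0)]
rescaled = senate_weights(students)

# After rescaling, both countries contribute the same total weight.
country_sums = defaultdict(float)
for country, weight in rescaled:
    country_sums[country] += weight
```

Relative weights within a country are unchanged by this rescaling; only the country totals are equalised.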

Another somewhat unusual feature of PISA 2012 was that young people were randomly assigned to complete one of three different versions of the background questionnaire. Throughout this paper we restrict our analysis to the random sub-sample of 40,550 young people[2] from Anglophone countries who completed either form A or form C, which included the following question: ‘Thinking about mathematical concepts: how familiar are you with the following terms?’

[2] Sample sizes by country are as follows: Australia 9,246; Canada 13,901; England 2,685; Ireland 3,267; Northern Ireland 1,430; New Zealand 2,762; Scotland 1,901; USA 3,193; Wales 2,165. Although sample sizes differ, senate weights are applied when data is pooled across countries to ensure each nation receives equal weight.

A list of 16 items was then given to students, who were asked to indicate their knowledge of that particular mathematics concept on a five-point scale (ranging from ‘never heard of it’ to ‘know it well, understand the concept’). These constructs were:

1. Exponential function
2. Divisor
3. Quadratic function
4. Proper number
5. Linear equation
6. Vectors
7. Complex number
8. Rational number
9. Radicals
10. Subjunctive scaling
11. Polygon
12. Declarative fraction
13. Congruent figure
14. Cosine
15. Arithmetic mean
16. Probability

Critically, of these 16 constructs, three of them (items 4, 10 and 12) are fake; students are asked about their familiarity with some mathematics concepts that do not exist. We use participants’ responses to these three items to form our ‘bullshit’ scale. This is done via estimation of a Confirmatory Factor Analysis (CFA) model, with the three fake items treated as observed indicators of the latent bullshit construct. These multi-group CFA (MGCFA) models are fitted using Mplus (Muthén and Muthén, 1998-2017), with the final student weights applied and standard errors clustered at the school level. A WLSMV estimator with THETA parameterisation[3] was used to account for the ordered categorical nature of the questions (Muthén et al. 2015).

[3] WLSMV is an estimator which is suitable for categorical variables. It performs a probit regression using a robust weighted least squares estimator with a diagonal weight matrix. THETA parameterisation allows the residual variances of the latent trait to be parameters in the model, while excluding scale factors (Muthén and Muthén, 1998-2017).

As our aim is to compare average scores on this scale, it is important that we investigate whether the latent construct is consistently understood and measured in the same way across demographic groups. We therefore follow standard practice and test for three hierarchical levels of measurement invariance (configural, metric and scalar) within each country according to the following demographic characteristics:

• Gender
• Socio-economic status
• Mathematics achievement quartile
• Immigrant status

Further details around this methodology and the measurement invariance results can be found in Appendix A. Bullshit scale scores are then derived from these MGCFA models, with average scores then compared across groups where full or partial scalar measurement invariance holds. The bullshit scale has been standardised to mean zero and standard deviation one within each country, so that all differences between groups can be interpreted in terms of an effect size.

Measures of self-efficacy

Within our analysis, we consider whether young people who score highly on the bullshit scale also display a series of other psychological characteristics, the first of which is overconfidence as measured by their mathematics self-efficacy. Specifically, as part of the PISA background questionnaire, participants were asked how confident they are in being able to complete the following eight tasks, according to a four-point scale (ranging from very confident to not confident at all):

Task 1. Using a train timetable to work out how long it would take to get from one place to another.
Task 2. Calculating how much cheaper a TV would be after a 30% discount.
Task 3. Calculating how many square metres of tiles you need to cover a floor.
Task 4. Understanding graphs presented in newspapers.
Task 5. Solving an equation like 3x+5 = 17.
Task 6. Finding the actual distance between two places on a map with a 1:10,000 scale.
Task 7. Solving an equation like 2(x+3) = (x+3)(x-3).
Task 8. Calculating the petrol consumption rate of a car.

Throughout our analysis we dichotomise teenagers’ responses, so that we compare the percentage of young people who said they are confident/very confident to the percentage who said they were not confident/not confident at all. The survey organisers have also created a mathematics ‘self-efficacy’ scale, combining young people’s responses to these eight items into a single continuous index. We standardise this scale so that the mean in each country is zero and the standard deviation one. Further details on how we use such measures within our analyses are provided in section 3 below.

Self-belief in problem-solving abilities

Students were asked to indicate how well they believe the following five statements describe them and their problem-solving ability:

1. I can handle a lot of information.
2. I am quick to understand things.
3. I seek explanations for things.
4. I can easily link facts together.
5. I like to solve complex problems.

Throughout our analysis, we consider responses to these five items, focusing upon whether respondents said that the statement was very much/mostly like me (coded one) or if they indicated that it was somewhat/not much/not like me at all (coded zero). A total scale score has again also been derived by the survey organisers, which we standardise to mean zero and standard deviation one.
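The two recoding steps described above (collapsing ordered responses into a binary indicator, and standardising a scale to mean zero and standard deviation one within each country) can be sketched as follows. This is a minimal, unweighted illustration: the response labels, function names and toy scores are assumptions, and the use of the population standard deviation is a choice made here, not something the text specifies.

```python
from statistics import mean, pstdev


def dichotomise(response, positive=("very confident", "confident")):
    """Code a response as 1 if it falls in the 'positive' categories, else 0."""
    return 1 if response in positive else 0


def standardise_within(scores_by_country):
    """Return z-scores computed separately within each country.

    Uses the unweighted mean and population SD (an assumption for illustration).
    """
    standardised = {}
    for country, scores in scores_by_country.items():
        centre, spread = mean(scores), pstdev(scores)
        standardised[country] = [(s - centre) / spread for s in scores]
    return standardised


# Hypothetical toy data
answers = ["very confident", "not confident", "confident", "not confident at all"]
binary = [dichotomise(a) for a in answers]

raw_scale = {"X": [1.0, 2.0, 3.0, 4.0], "Y": [10.0, 20.0, 30.0]}
z = standardise_within(raw_scale)
```

Because each country is centred and scaled separately, a difference of 0.5 between two groups within a country reads directly as half a standard deviation.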

Self-reported popularity at school

To capture teenagers’ self-reported views on their popularity at school, they were asked ‘thinking about your school, to what extent do you agree with the following statements’:

1. I feel like an outsider (or left out of things) at school.
2. I make friends easily at school.
3. I feel like I belong at school.
4. I feel awkward and out of place in my school.
5. Other students seem to like me.
6. I feel lonely at school.
7. I feel happy at school.
8. Things are ideal in my school.
9. I am satisfied with my school.

Responses were to be given on a four-point scale, with our analysis of individual questions combining the strongly agree/agree categories and the disagree/strongly disagree categories into a binary scale.

Self-reported measures of perseverance

A series of five items were used in the background questionnaire to capture teenagers’ self-reported perseverance with challenging tasks. Specifically, they were asked ‘how well does each of the following statements below describe you’ with responses given on a five-point scale (very much like me, mostly like me, somewhat like me, not much like me, not at all like me):

1. When confronted with a problem, I give up easily.
2. I put off difficult problems.
3. I remain interested in the tasks that I start.
4. I continue working on tasks until everything is perfect.
5. When confronted with a problem, I do more than what is expected of me.

We again recode responses to these questions into a binary format, with very/mostly like me coded as one and zero otherwise. An overall scale combining information from all five items has also been derived and standardised to mean zero and standard deviation one.

Problem-solving approaches

As part of the background questionnaire, two hypothetical scenarios were set out to students, who were then asked how they would respond. The first scenario asked:

‘Suppose that you have been sending text messages from your mobile phone for several weeks. Today, however, you can’t send text messages. You want to try and solve the problem. What would you do?’

1. I press every button possible to find out what is wrong.
2. I think about what might have caused the problem and what I can do to solve it.
3. I read the manual.
4. I ask a friend for help.

Students were asked whether they would (a) definitely do this; (b) probably do this; (c) probably not do this or (d) definitely not do this, for each of the four statements above. We combine options (a) with (b) and (c) with (d), allowing us to compare young people who said they would probably/definitely use each strategy with those who said they probably/definitely would not.

The second scenario followed a similar structure, with participants asked: ‘Suppose that you are planning a trip to the zoo with your brother. You don’t know which route to take to get there. What would you do?’

1. I read the zoo brochure to see if it says how to get there.
2. I study a map and work out the best route.
3. I leave it to my brother to worry about how to get there.
4. I know roughly where it is, so I suggest we just start driving.

Participants were provided the same four response options (a to d) as per scenario 1, which we also convert into a binary format as described above.

Analytic models

After comparing average bullshit scale scores across demographic groups, and across countries, we investigate the self-reported self-efficacy, problem-solving skills, perseverance and popularity of bullshitters. To begin, we divide participants into four approximately equal groups (quartiles) based upon their scores on the bullshit scale. Those in the bottom quartile are then labelled ‘non-bullshitters’ (i.e. those young people who overwhelmingly said that they had not heard of the fake mathematics constructs) with the top quartile defined as the bullshitters (i.e. young people who claimed expertise in the fake constructs). Then, for these two groups, we compare how they responded to each of the self-efficacy, problem-solving, popularity and perseverance questions (and overall scale scores) described above.
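The grouping step above can be sketched in a few lines. This is an unweighted illustration only: the labels, function names and toy data are assumptions, and the quartile cut points come from Python's `statistics.quantiles` default method rather than whatever rule the authors used (real scale scores also contain ties, so groups would only be approximately equal in size).

```python
from statistics import quantiles


def label_by_quartile(scores):
    """Label the bottom quartile 'non-bullshitter' and the top quartile 'bullshitter'."""
    q1, _, q3 = quantiles(scores, n=4)  # three cut points: 25th, 50th, 75th percentiles
    labels = []
    for score in scores:
        if score <= q1:
            labels.append("non-bullshitter")
        elif score > q3:
            labels.append("bullshitter")
        else:
            labels.append("middle")
    return labels


def share_positive(labels, outcomes, group):
    """Percentage of a group whose binary outcome is coded one."""
    selected = [o for lab, o in zip(labels, outcomes) if lab == group]
    return 100.0 * sum(selected) / len(selected)


# Hypothetical toy data: bullshit scale scores and one dichotomised survey item.
scale_scores = [-1.5, -1.0, -0.5, -0.2, 0.1, 0.4, 0.9, 1.6]
item_response = [0, 1, 0, 1, 0, 1, 1, 1]

groups = label_by_quartile(scale_scores)
pct_bullshitters = share_positive(groups, item_response, "bullshitter")
pct_non_bullshitters = share_positive(groups, item_response, "non-bullshitter")
```

The raw comparison reported in the results tables corresponds to the difference between the two percentages computed at the end.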

A limitation with such summary statistics is that there could be confounding characteristics driving the results. For instance, with respect to self-efficacy, it will be particularly important to consider whether bullshitters are much more likely to believe that they can complete each of the eight mathematics tasks than non-bullshitters after conditioning upon their actual measured academic ability. In other words, do teenagers who bullshit about their mathematics knowledge also display overconfidence in their mathematics skills? For each of the outcome measures described in section 2, we therefore estimate the following OLS regression model within each country:[4]

\[ O_{ij} = \alpha + \beta \cdot BS + \gamma \cdot A + \delta \cdot SES + \tau \cdot D + u_{j} + \varepsilon_{ij} \]

Where:
O_{ij} = The outcome variable of interest (e.g. teenagers’ self-efficacy).
BS = A set of dummy variables reflecting quartiles of the bullshit scale.
A = Teenagers’ academic achievement in mathematics, reading, science and problem solving, as measured by the PISA test.[5]
SES = Teenagers’ socio-economic status, as measured by the PISA Economic, Social and Cultural Status (ESCS) index.
D = A vector of controls for teenagers’ demographic characteristics (e.g. gender and immigrant status).
u_{j} = School fixed effects.
i = Student i.
j = School j.

[4] As we have dichotomised participants’ responses, this is equivalent to estimating a linear probability model for each item, along with a standard OLS model for the overall scale score.
[5] We include controls for all PISA plausible values in each subject area.

The parameters of interest from these models are the estimated β coefficients. These will reveal differences between the bullshitter and non-bullshitter groups, conditional upon their gender, socio-economic status, mathematics, reading, science and problem-solving skills and attendance within the same school. Such conditional associations will help reveal whether bullshitters provide different answers to the self-efficacy, perseverance, popularity and problem-solving questions, compared to non-bullshitters of the same demographic background, of equal academic ability and within the same school.

3. Results

Who are the bullshitters?

Table 1 considers how average scores on the bullshit scale differ between demographic groups. There is an important difference between genders; boys are much more likely to be bullshitters than girls. This holds true across all nine countries, with all differences statistically significant and equivalent to an effect size of between 0.4 and 0.5 standard deviations in most countries. It is also notable how the gender gap in bullshitting is significantly weaker in North America (0.25 in the United States and 0.34 in Canada) than it is in Europe (e.g. gender gaps of between 0.4 and 0.5 are observed for England, Ireland, Scotland and Wales). Consequently, Table 1 provides strong and consistent evidence that teenage boys are bigger bullshitters than teenage girls.

<< Table 1 >>

A similar difference is found with respect to socio-economic status; young people from more advantaged socio-economic backgrounds have higher average bullshit scores than their less advantaged peers. The magnitude of the difference is again not trivial and varies somewhat across countries. For instance, the difference in average bullshit scores between the top and bottom socio-economic quartile stands above 0.6 standard deviations in Scotland and New Zealand, but below 0.3 in England, Canada and the United States. Nevertheless, in all nine countries, the difference is statistically significant at the five percent level. These results therefore provide strong evidence that young people from more affluent backgrounds are more likely to be bullshitters than young people from disadvantaged backgrounds.

The final difference considered in Table 1 is between immigrant and native groups.
In most countries, immigrants have significantly higher scores than young people who are country natives. This is particularly pronounced in European Anglophone countries, where immigrants typically score around 0.35 standard deviations higher on the bullshit scale than young people who were born in the country. The association is typically slightly weaker outside of Europe, with there actually being no difference between immigrants and natives in the United States. Hence, although we find a general pattern of immigrants being bigger bullshitters than natives, the strength of this association seems to vary quite substantially between countries (and, thus, the characteristics and home locations of the immigrant groups).

Finally, in additional analysis, we have also estimated the within versus between school variation of the bullshit scale within each country. Our motivation was to establish whether bullshitters tend to cluster together within the same school, or if bullshitters are fairly equally distributed across schools. We find that the intraclass correlation coefficients (ICCs) tend to be very low; in most countries less than three percent of the variance in the bullshit scale occurs between schools. This perhaps helps to explain why everyone knows a bullshitter; these individuals seem to be relatively evenly spread across schools (and thus peer groups).

Are teenagers in some countries bigger bullshitters than others?

Table 2 provides our comparison of the bullshit scale across eight of the nine Anglophone countries.[6] The top panel provides the average standardised scale score, while the bottom panel provides t-statistics for pairwise comparisons across countries. Green shading with an asterisk highlights where differences across countries are statistically significant at the five percent level.

<< Table 2 >>

Three broad clusters of countries seem to have emerged. At the top of the rankings are the two North American countries of the United States and Canada. With average scale scores of 0.25 and 0.3, these two countries have significantly higher bullshit scores than any other country. The next three countries (Australia, New Zealand and England) are in the middle of the rankings. Teenagers in these countries exaggerate less about their prowess, on average, than young people in Canada and the United States, by a magnitude equivalent to an effect size of around 0.1. However, they are also significantly bigger bullshitters than young people from Ireland, Northern Ireland and Scotland, who form the final group. The average bullshit scale score in these countries ranges between approximately -0.26 (Ireland and Northern Ireland) and -0.43 (Scotland), which is significantly lower than every other country.
Moreover, the difference between these countries and North America is sizeable; equivalent to an effect size greater than 0.5. Consequently, despite speaking the same language, and with a closely shared culture and history, we find important variation across Anglophone countries in teenagers’ propensity to bullshit.

[6] Wales has been excluded from this comparison; see Appendix A for further details.
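Because the scale scores are expressed in standard deviation units, a difference between two country averages can be read as an effect size. The short check below uses only the approximate averages quoted in the text for the top and bottom clusters; the variable names are illustrative, and no assumption is made about which North American country has which average.

```python
# Approximate average scale scores quoted in the text (standard deviation units).
north_america = [0.25, 0.30]   # United States and Canada
lowest_group = [-0.26, -0.43]  # Ireland / Northern Ireland, and Scotland

# Smallest and largest possible gaps between the two clusters.
smallest_gap = min(north_america) - max(lowest_group)
largest_gap = max(north_america) - min(lowest_group)

print(f"gap ranges from {smallest_gap:.2f} to {largest_gap:.2f} standard deviations")
```

Even the smallest pairing exceeds half a standard deviation, which matches the statement that the difference is equivalent to an effect size greater than 0.5.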

A psychological profile of bullshitters

Table 3 now turns to how bullshitters responded to other items included in the PISA background questionnaire. These results are based upon the pooled sample including young people from across the Anglophone countries. The top panel refers to their self-confidence in completing the eight mathematics tasks described in section 2, while the bottom panel illustrates how they view their problem-solving abilities. Figures refer to the percentage of young people who believe they could complete the task relatively easily or who believe that they have each specific problem-solving skill. The final rows provide the average score for bullshitters and non-bullshitters on the self-efficacy and problem-solving scales. These can be interpreted in terms of an effect size. The raw differences in estimates between these groups are then reported, along with the regression model estimates that control for demographic background, prior academic achievement and school fixed-effects.

<< Table 3 >>

Starting with the results for self-efficacy, there are substantial and statistically significant differences between the high and low bullshit groups on the eight questions asked. For instance, whereas just 40 percent of non-bullshitters were confident that they could work out the petrol consumption of a car (task eight), two-thirds of the bullshitter group claimed that they could do this. Moreover, a sizeable difference can still be observed in the regression model results, illustrating how bullshitters express much higher levels of self-confidence in their skills than non-bullshitters, even when they are of equal academic ability. Specifically, the difference in the average self-efficacy scale score is approaching 0.5 standard deviations; a large and statistically significant effect. Together, these results illustrate how young people who tend to bullshit are also likely to express overconfidence in their skills.
The lower panel of Table 3 confirms these results. When asked about their problem-solving skills, bullshitters are around 20 percentage points more likely to say that they ‘can handle a lot of information’, ‘can easily link facts together’, ‘are quick to understand things’ and ‘like to solve complex problems’. Although controlling for achievement, demographics and school characteristics can explain some of the difference between the high and low bullshit groups, significant differences remain; we continue to observe an effect size difference of around one-quarter of a standard deviation, even after such characteristics have been controlled. This again demonstrates the overconfidence expressed by bullshitters.

In Table 4 we report results for young people’s self-reported perseverance. Bullshitters are much less likely to say that they give up easily when faced with a difficult problem (eight versus 17 percent) and that they are put off by difficult problems (15 versus 27 percent). Yet they are more likely to say that they exceed expectations when faced with a difficult problem (45 versus 28 percent). In other words, bullshitters claim to have particularly high levels of perseverance when faced with challenging tasks. The difference in the average perseverance scale score between the high and low bullshit groups is 0.50 standard deviations (0.41 once controls have been added), representing a sizeable and statistically significant effect. Table 4 therefore illustrates how bullshitters claim to persevere more with hard-to-solve problems than other groups, independent of a range of background characteristics.

<< Table 4 >>

How do bullshitters claim that they solve problems? Table 5 provides some insight into this issue by summarising how they said they would solve two routine tasks (see section 2 for further details). Interestingly, the most pronounced and statistically significant results are with respect to the most ‘socially desirable’ (or the most ‘obviously sensible’) strategy. For instance, if their mobile phone stops sending text messages, bullshitters are somewhat less likely to say that they would press all the buttons to find out what is wrong (49 versus 56 percent) but much more likely to say that they would consult the instruction manual (41 versus 29 percent). Likewise, if they do not know the route to their destination, bullshitters are much more likely to say that they would consult a map than other groups (71 versus 55 percent).
Although we do not know what strategy these young people would actually use, Table 5 nevertheless provides some indication that bullshitters are much more likely to say that they would take the most obviously sensible approach.

<< Table 5 >>

Finally, do bullshitters believe that they are popular at school? Table 6 provides some suggestion that this may be the case. The average ‘school well-being’ scale score is around 0.2 standard deviations higher for bullshitters, and stays at this level even after achievement, demographic and school controls have been added. There is a particularly notable difference in response to the question ‘things are ideal at my school’, to which 75 percent of bullshitters agree (compared to 64 percent of non-bullshitters). Therefore, although the evidence is perhaps weaker than for the previous topics considered, we nevertheless find some evidence that

bullshitters are particularly likely to believe that they are popular at school (and certainly believe they are no less popular than their non-bullshitting peers).

<< Table 6 >>

Investigations of possible alternative explanations

There are two primary threats to the validity of our interpretation of the results above. The first alternative explanation is that, rather than capturing young people’s propensity to bullshit, the three fake constructs provide evidence of a careless or extreme response style. For instance, some respondents may not be taking the questionnaire seriously, and are simply ticking the top category for every question. A second possibility is that young people’s responses are reflecting social desirability bias; that they are providing responses that they believe will be viewed positively by others (e.g. that they know various mathematics concepts, that they work hard at school etc). Both of these possibilities could lead to a spurious correlation between our bullshit index and the various other psychological traits investigated in the previous sub-section. Similarly, if children with certain characteristics (e.g. boys, immigrants, young people in particular countries) are more likely to provide careless or socially desirable responses, then this could explain why we observe differences between demographic groups.

To explore this possibility further, we investigate how our bullshit index is related to young people’s responses to two other questions in the PISA background questionnaire: (a) test motivation and (b) truancy at school. Specifically, children were asked to provide the amount of effort they put into the PISA study using a zero to ten scale and how many times they were absent from school over the last two weeks. If respondents are indeed providing high responses consistently across questions (either due to carelessness, response style or social desirability) then we should observe a strong correlation between our bullshit index and young people’s self-reported truancy and test motivation. These results are presented in Table 7.

<< Table 7 >>

We find no evidence that the bullshit index is related to young people’s test motivation; the Pearson correlation is -0.03 while Table 7 highlights how the average test effort reported was 7.6 within each bullshit quartile. Similarly, panel (b) of Table 7 illustrates how the bullshit scale is not associated with self-reported truancy from school. Specifically, around 80 percent of young people said they were not absent from school at any point during the last two weeks, regardless of how they responded to the questions which form our bullshit scale.

Together, this provides us with reassurance that the correlations observed in the previous sub-section are unlikely to be driven by social desirability bias or other forms of careless/extreme response.

4. Conclusions

Bullshitting is a well-known social phenomenon. It can be summarised as a situation where an individual claims to have knowledge, experience or expertise in some matter, when really they do not. The label “bullshitter” is then assigned to someone who makes such claims on a regular basis; i.e. a person who consistently exaggerates their prowess and/or frequently tells untruths. Although this concept is well known in everyday life (we all probably know a bullshitter), very little academic research has been conducted into this issue. What, for instance, are the demographic characteristics of bullshitters? Is it a masculine or a feminine trait, and is it something that varies between socio-economic groups? Do young people in some countries tend to bullshit more than those in others? And what other psychological characteristics do bullshitters display; do they display overconfidence, have a tendency to provide socially desirable answers or have an inflated sense of popularity amongst their peers?

This paper has attempted to explore such issues using large-scale, nationally representative data. Focusing upon 15-year-olds from across nine Anglophone countries, we have investigated the characteristics of young people who claim to have knowledge and expertise in three mathematics concepts which are fake. Having derived and established the comparability of our bullshit scale via measurement invariance procedures, we go on to find that young men are more likely to bullshit than young women, and that bullshitting is somewhat more prevalent amongst those from more advantaged socioeconomic backgrounds.
Compared to other countries, young people in North America are found to be bigger bullshitters than young people in England, Australia and New Zealand, while those in Ireland and Scotland are the least likely to exaggerate their mathematical knowledge and abilities. Strong evidence also emerges that bullshitters display overconfidence in their academic prowess and problem-solving skills, while also reporting higher levels of perseverance when faced with challenges and providing more socially desirable responses than more truthful groups.

There are of course limitations to this study, and many issues on the topic of bullshitting that remain unexplored. First, the PISA data analysed are cross-sectional rather than longitudinal. We therefore do not know whether bullshitting is a stable trait that can be consistently observed for an individual over time, or if it is something that changes with age (and the factors

associated with such change). Likewise, the implications of being a bullshitter remain unclear. Although this concept often has negative connotations, being able to bullshit convincingly may be useful in certain situations (e.g. job interviews, negotiations, grant applications). Yet the social and labour market outcomes of bullshitters remain unknown and are thus a key issue in need of further research. Second, our analysis has only considered the propensity to bullshit in a single area (knowledge of mathematics concepts). Future work should consider the overlap between bullshitting with respect to different areas of life, such as young people’s knowledge/experience of drug taking or of their sexual experiences (for instance). This will help us to identify those individuals who consistently lie about multiple aspects of their life. Finally, it is important we recognise that our bullshit scale was based upon three specific items. Ideally, future research should try to include a greater number of fake constructs in order to maximise precision of the bullshit scale.

Despite these limitations, we believe this paper has started to open an important new area of social science research. Bullshitting is a widely recognised social ‘skill’ which is likely to have an impact upon a person’s life. We have established how some groups are clearly more likely to bullshit (and be caught bullshitting) than others, and that these individuals tend to display certain other psychological traits (most notably a striking overconfidence in their own abilities). It is critical that a developmental perspective is now taken with respect to bullshitting so that we can understand what leads individuals to develop such habits, and whether it turns out to be associated with better or worse social and labour market outcomes.

References

Anderson, J. O., Lin, H. S., Treagust, D. F., Ross, S. P., & Yore, L. D. (2007). Using large-scale assessment datasets for research in science and mathematics education: Programme for International Student Assessment (PISA). International Journal of Science and Mathematics Education, 5(4), 591-614.

Atir, S., Rosenzweig, E., & Dunning, D. (2015). When knowledge knows no bounds: Self-perceived expertise predicts claims of impossible knowledge. Psychological Science, 26(8), 1295-1303.

Black, M. (1985). The prevalence of Humbug and other essays. Cornell University Press.

Chen, F. (2007). Sensitivity of goodness of fit indexes to lack of measurement invariance. Structural Equation Modeling, 14, 464-504.

Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9, 233-255.

Cohen, G. A. (2002). Deeper into bullshit. Contours of agency: Essays on themes from Harry Frankfurt, 321-339.

Frankfurt, H. G. (2005). On bullshit. Princeton University Press.

Heer, J. (2018, March 15). Worse than a Liar. The New Republic. Retrieved from https://newrepublic.com/article/147504/worse-liar-trump-lies-trudeau

Jeffries, S. (2017, May 22). 'Bullshit is a greater enemy than lies': lessons from three new books on the post-truth era. The Guardian. Retrieved from https://www.theguardian.com/us-news/2017/may/22/post-truth-era-trump-brexit-lies-books

Jerrim, J. & Choi, A. (2014). The mathematics skills of school children: How does the UK compare to the high performing East Asian nations? Journal of Education Policy, 29, 349-376.

Kenny, D., Kaniskan, B., & McCoach, D. (2015). The performance of RMSEA in models with small degrees of freedom. Sociological Methods & Research, 44(3), 486-507.

Möbius, M. M., Niederle, M., Niehaus, P., & Rosenblat, T. S. (2011). Managing self-confidence: Theory and experimental evidence (No. w17014). National Bureau of Economic Research.

Muthén, B., Muthén, L., & Asparouhov, T. (2015). Estimator choices with categorical variables. Retrieved from https://www.statmodel.com/download/EstimatorChoices.pdf

Muthén, L. K., & Muthén, B. O. (1998-2017). Mplus User’s Guide. Eighth Edition. Los Angeles, CA: Muthén & Muthén.

Niederle, M., & Vesterlund, L. (2007). Do women shy away from competition? Do men compete too much? The Quarterly Journal of Economics, 122(3), 1067-1101.

Parker, P. D., Marsh, H. W., Jerrim, J. P., Guo, J., & Dicke, T. (2018). Inequity and Excellence in Academic Performance: Evidence From 27 Countries. American Educational Research Journal, 0002831218760213.

Paulhus, D. L., Harms, P. D., Bruce, M. N., & Lysy, D. C. (2003). The over-claiming technique: Measuring self-enhancement independent of ability. Journal of Personality and Social Psychology, 84(4), 890.

Pennycook, G., Cheyne, J. A., Barr, N., Koehler, D. J., & Fugelsang, J. A. (2015). On the reception and detection of pseudo-profound bullshit. Judgment and Decision Making.

Pennycook, G. & Rand, D. G. (2018). Who falls for fake news? The roles of bullshit receptivity, overclaiming, familiarity, and analytic thinking, mimeo.

Petrocelli, J. V. (2018). Antecedents of bullshitting. Journal of Experimental Social Psychology, 76, 249-258.

Putnick, D., & Bornstein, M. (2016). Measurement invariance conventions and reporting: The state of the art and future directions for psychological research. Developmental Review, 41, 71-90.

Sarsons, H., & Xu, G. (2016). Confidence Men? Gender and Confidence: Evidence among Top Economists, mimeo.

Stanovich, K. E., & Cunningham, A. E. (1992). Studying the consequences of literacy within a literate society: The cognitive correlates of print exposure. Memory & Cognition, 20(1), 51-68.

Yglesias, M. (2017, May 30). The Bullshitter-in-Chief. Vox. Retrieved from https://www.vox.com/policy-and-politics/2017/5/30/15631710/trump-bullshit

Zieger, L., Jerrim, J., & Sims, S. (2018). Comparing teachers’ job satisfaction across countries. A multiple-pairwise measurement invariance approach. Forthcoming.

Table 1. The association between demographic characteristics and average scores on the bullshit scale

(a) Gender

| Country | Girls | Boys | Gap (effect size) | SE |
|---|---|---|---|---|
| England | -0.23 | 0.24 | 0.48* | 0.04 |
| Ireland | -0.23 | 0.23 | 0.46* | 0.04 |
| Scotland | -0.23 | 0.21 | 0.44* | 0.05 |
| Australia | -0.21 | 0.21 | 0.42* | 0.02 |
| Wales | -0.21 | 0.21 | 0.42* | 0.05 |
| New Zealand | -0.20 | 0.20 | 0.40* | 0.04 |
| Northern Ireland | -0.18 | 0.17 | 0.35* | 0.05 |
| Canada | -0.17 | 0.17 | 0.34* | 0.02 |
| USA | -0.13 | 0.13 | 0.25* | 0.04 |

(b) Socio-economic status

| Country | Low SES | Q2 | Q3 | High SES | Gap (effect size) | SE |
|---|---|---|---|---|---|---|
| Scotland | -0.36 | 0.08 | 0.09 | 0.30 | 0.65* | 0.06 |
| New Zealand | -0.33 | -0.09 | 0.03 | 0.29 | 0.62* | 0.06 |
| Ireland | -0.21 | 0.07 | -0.02 | 0.23 | 0.44* | 0.06 |
| Australia | -0.18 | -0.12 | 0.02 | 0.25 | 0.42* | 0.03 |
| Wales | -0.17 | -0.03 | 0.04 | 0.19 | 0.36* | 0.06 |
| England | -0.17 | -0.09 | 0.02 | 0.12 | 0.29* | 0.06 |
| Canada | -0.13 | -0.07 | -0.05 | 0.15 | 0.28* | 0.04 |
| USA | -0.09 | -0.04 | 0.02 | 0.11 | 0.20* | 0.06 |

(c) Immigrant group

| Country | Natives | Immigrants | Gap (effect size) | SE |
|---|---|---|---|---|
| Northern Ireland | -0.02 | 0.64 | 0.66* | 0.15 |
| Ireland | -0.04 | 0.34 | 0.38* | 0.08 |
| England | -0.05 | 0.32 | 0.37* | 0.07 |
| Wales | -0.01 | 0.35 | 0.36* | 0.11 |
| New Zealand | -0.09 | 0.26 | 0.36* | 0.05 |
| Scotland | -0.03 | 0.32 | 0.36* | 0.08 |
| Canada | -0.05 | 0.12 | 0.17* | 0.03 |
| Australia | -0.04 | 0.13 | 0.16* | 0.03 |
| USA | 0.00 | 0.01 | 0.01 | 0.06 |

Notes: The bullshit scale has been standardised within each country to mean zero and standard deviation one. The gap refers to the difference between groups in terms of an effect size. SE refers to the standard error of the gap. Northern Ireland excluded from socio-economic status results due to factor scores not able to be calculated. * indicates statistical significance at the five percent level.
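The notes to Table 1 describe two steps: standardising the scale within a country, then reading group differences as effect sizes. A minimal sketch of that arithmetic follows. It assumes unweighted data (the paper applies survey weights, which are omitted here), and the function names and the six raw scores are hypothetical, chosen only for illustration.

```python
from statistics import mean, pstdev


def standardise(scores):
    """Rescale scores to mean 0 and standard deviation 1 (unweighted)."""
    mu = mean(scores)
    sigma = pstdev(scores)
    return [(s - mu) / sigma for s in scores]


def effect_size_gap(z_scores, groups, high, low):
    """Difference in mean standardised score between two named groups."""
    high_mean = mean(z for z, g in zip(z_scores, groups) if g == high)
    low_mean = mean(z for z, g in zip(z_scores, groups) if g == low)
    return high_mean - low_mean


# Hypothetical raw scale scores for six pupils in one country (not PISA data).
raw = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gender = ["girl", "girl", "girl", "boy", "boy", "boy"]

z = standardise(raw)
gap = effect_size_gap(z, gender, high="boy", low="girl")
```

Because the scores are standardised first, `gap` is expressed in standard deviation units, which is how the "Gap (effect size)" columns are read.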

Table 2. International comparison of average bullshit scores across Anglophone countries

(a) Average bullshit scale scores across countries

| Country | Mean | Standard error |
|---|---|---|
| Canada | 0.298 | 0.014 |
| USA | 0.252 | 0.023 |
| Australia | 0.179 | 0.010 |
| New Zealand | 0.135 | 0.022 |
| England | 0.093 | 0.021 |
| Ireland | -0.255 | 0.019 |
| Northern Ireland | -0.265 | 0.027 |
| Scotland | -0.432 | 0.025 |

(b) T-statistics for pairwise country comparisons

| | Canada | USA | Australia | New Zealand | England | Ireland | Northern Ireland | Scotland |
|---|---|---|---|---|---|---|---|---|
| Canada | - | - | - | - | - | - | - | - |
| USA | 1.72 | - | - | - | - | - | - | - |
| Australia | 6.81* | 2.93* | - | - | - | - | - | - |
| New Zealand | 6.27* | 3.73* | 1.83 | - | - | - | - | - |
| England | 8.25* | 5.22* | 3.75* | 1.39 | - | - | - | - |
| Ireland | 23.23* | 17.08* | 19.77* | 13.32* | 12.31* | - | - | - |
| Northern Ireland | 18.26* | 14.53* | 15.09* | 11.35* | 10.40* | 0.29 | - | - |
| Scotland | 25.71* | 20.43* | 22.75* | 17.13* | 16.29* | 5.65* | 4.53* | - |

Notes: Average bullshit scale scores have been standardised to mean zero and standard deviation one across the eight Anglophone countries. Wales has been excluded based upon measurement invariance tests. Green shaded cells in panel b indicate where the difference across countries is statistically significant (absolute value of the t-statistic is greater than 1.96). Red-shaded cells with italic font illustrate where cross-country differences are not statistically significant (absolute value of the t-statistic is less than 1.96).
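The t-statistics in panel (b) can be approximated from the means and standard errors in panel (a). The sketch below assumes the two country samples are independent, so that the standard error of a difference is the square root of the sum of the squared standard errors. The paper does not state the exact formula it used, and the function name `pairwise_t` is introduced here for illustration only. Because panel (a) is rounded to three decimals, the results match panel (b) only approximately (for example, about 1.71 rather than 1.72 for Canada versus the USA).

```python
import math


def pairwise_t(mean_a, se_a, mean_b, se_b):
    """t-statistic for the difference between two independent means."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# (mean, standard error) pairs taken from panel (a) of Table 2.
canada = (0.298, 0.014)
usa = (0.252, 0.023)
scotland = (-0.432, 0.025)

t_canada_usa = pairwise_t(*canada, *usa)
t_canada_scotland = pairwise_t(*canada, *scotland)

# Two-sided test at the five percent level, as in the table notes.
canada_usa_significant = abs(t_canada_usa) > 1.96
canada_scotland_significant = abs(t_canada_scotland) > 1.96
```

Under these assumptions the Canada versus USA gap is not significant, while the Canada versus Scotland gap clearly is, in line with panel (b).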

Table 3. Bullshitters’ views of their abilities

| | Non-bullshitters | Bullshitters | Unconditional difference | Regression difference | SE |
|---|---|---|---|---|---|
| Self-efficacy | | | | | |
| Believe could complete Task 1 | 80% | 88% | 7% | 5%* | 1.3% |
| Believe could complete Task 2 | 74% | 86% | 12% | 10%* | 1.3% |
| Believe could complete Task 3 | 58% | 80% | 22% | 17%* | 1.4% |
| Believe could complete Task 4 | 79% | 90% | 11% | 9%* | 1.2% |
| Believe could complete Task 5 | 79% | 90% | 11% | 6%* | 1.2% |
| Believe could complete Task 6 | 39% | 67% | 28% | 19%* | 1.6% |
| Believe could complete Task 7 | 63% | 81% | 18% | 13%* | 1.5% |
| Believe could complete Task 8 | 40% | 68% | 28% | 19%* | 1.5% |
| Scale score (standardised) | -0.41 | 0.26 | 0.67 | 0.48* | 0.03 |
| Views of problem-solving ability | | | | | |
| Can handle a lot of information | 43% | 62% | 18% | 14%* | 2% |
| Quick to understand things | 45% | 62% | 17% | 11%* | 2% |
| Seek explanations for things | 61% | 66% | 5% | 2% | 2% |
| Easily link facts together | 49% | 65% | 17% | 13%* | 2% |
| Like to solve complex problems | 24% | 47% | 23% | 18%* | 2% |
| Scale score (standardised) | -0.36 | 0.14 | 0.49 | 0.35* | 0.03 |

Notes: Figures refer to percent of young people who agree or strongly agree. Non-bullshitters refers to young people in the bottom quarter of the derived bullshit scale score distribution, while bullshitters are defined as the top quartile. Regression estimates refer to the difference between bullshitters and non-bullshitters controlling for gender, socio-economic status, immigrant status, PISA reading, maths and science scores and school fixed effects. The ‘scale score’ row refers to results based upon a continuous index combining data across all items. This has been standardised to mean 0 and standard deviation 1, and can therefore be interpreted in terms of an effect size. * indicates that the difference between bullshitters and non-bullshitters is statistically significant at the five percent level. A full list of the self-efficacy tasks can be found in section 2.
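The notes above define the two comparison groups by quartile of the bullshit scale. A rough sketch of that grouping, and of the unconditional difference in percent agreeing, is given below. It is unweighted and uses made-up scores and answers; the helper names `label_groups` and `percent_agree` are hypothetical, and the regression-adjusted column is not reproduced.

```python
from statistics import quantiles


def label_groups(scores):
    """Label bottom-quartile, top-quartile and middle respondents."""
    q1, _median, q3 = quantiles(scores, n=4)
    labels = []
    for s in scores:
        if s <= q1:
            labels.append("non-bullshitter")
        elif s >= q3:
            labels.append("bullshitter")
        else:
            labels.append("middle")
    return labels


def percent_agree(agree, labels, group):
    """Percent of a group answering 'agree' or 'strongly agree'."""
    answers = [a for a, g in zip(agree, labels) if g == group]
    return 100.0 * sum(answers) / len(answers)


# Hypothetical scale scores and agree/disagree answers for eight pupils.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
agree = [False, True, False, False, True, False, True, True]

labels = label_groups(scores)
unconditional_gap = (percent_agree(agree, labels, "bullshitter")
                     - percent_agree(agree, labels, "non-bullshitter"))
```

The resulting `unconditional_gap` is in percentage points, matching the "Unconditional difference" column.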

Table 4. Bullshitters’ views of their perseverance

| | Non-bullshitters (% agree) | Bullshitters (% agree) | Unconditional difference | Regression difference | SE |
|---|---|---|---|---|---|
| When confronted with a problem, I give up easily | 17% | 8% | -9% | -5%* | 1.2% |
| I put off difficult problems | 27% | 15% | -12% | -9%* | 1.3% |
| I remain interested in the tasks that I start | 46% | 60% | 14% | 14%* | 1.6% |
| I continue working on tasks until everything is perfect | 42% | 55% | 13% | 10%* | 1.7% |
| When confronted with a problem, I do more than what is expected of me | 28% | 45% | 17% | 15%* | 1.7% |
| Scale score (standardised) | -0.37 | 0.13 | 0.50 | 0.41 | 0.03 |

Notes: See notes to Table 3. * indicates statistically significant difference between bullshitters and non-bullshitters at the five percent level.

Table 5. The bullshitter approach to problem solving

| | Non-bullshitters (% agree) | Bullshitters (% agree) | Unconditional difference | Regression difference | SE |
|---|---|---|---|---|---|
| Task 1 | | | | | |
| I press every button possible to find out what is wrong | 56% | 49% | -7% | -3% | 1.8% |
| I think about what might have caused the problem and what I can do to solve it | 85% | 90% | 5% | 3%* | 1.1% |
| I read the manual | 29% | 41% | 12% | 10%* | 1.7% |
| I ask a friend for help | 79% | 75% | -4% | -2% | 1.4% |
| Task 2 | | | | | |
| I read the zoo brochure to see if it says how to get there | 75% | 73% | -2% | -1% | 1.5% |
| I study a map and work out the best route | 55% | 71% | 16% | 9%* | 1.6% |
| I leave it to my brother to worry about how to get there | 34% | 27% | -7% | -5%* | 1.6% |
| I know roughly where it is, so I suggest we just start driving | 64% | 60% | -4% | -1% | 1.7% |

Notes: See notes to Table 3. In task 1, participants were asked ‘Suppose that you have been sending text messages from your mobile phone for several weeks. Today, however, you can’t send text messages. You want to try to solve the problem. Which of the following would you do?’ In task 2, participants were asked ‘Suppose that you are planning a trip to the zoo with your brother. You don’t know which route to take to get there. Which of the following would you do?’ Figures refer to the percent of young people who said they would either ‘definitely’ or ‘probably’ use this problem-solving strategy. * indicates statistically significant difference between bullshitters and non-bullshitters at the five percent level.

Table 6. Do bullshitters believe they are popular?

| | Non-bullshitters (% agree) | Bullshitters (% agree) | Unconditional difference | Regression difference | SE |
|---|---|---|---|---|---|
| Left out of things at school | 13% | 13% | 0% | -2% | 1.2% |
| Make friends easily at school | 85% | 89% | 4% | 5%* | 1.3% |
| Feel like I belong at school | 74% | 81% | 7% | 7%* | 1.5% |
| Feel awkward/out of place in my school | 15% | 16% | 1% | 1% | 1.4% |
| Other students seem to like me | 92% | 93% | 1% | 2% | 1.0% |
| Feel lonely at school | 10% | 10% | 1% | 0% | 1.1% |
| Feel happy at school | 77% | 84% | 7% | 8%* | 1.4% |
| Things are ideal in my school | 64% | 75% | 10% | 11%* | 1.7% |
| I am satisfied with my school | 76% | 83% | 8% | 9%* | 1.5% |
| Scale score (standardised) | -0.11 | 0.07 | 0.18 | 0.20* | 0.04 |

Notes: See notes to Table 3. * indicates statistically significant difference between bullshitters and non-bullshitters at the five percent level.

Table 7. The association between the bullshit scale and young people’s test motivation and truancy from school

(a) Test motivation

| Bullshit index | Average | Standard error |
|---|---|---|
| Bottom quartile | 7.60 | 0.03 |
| Second quartile | 7.63 | 0.03 |
| Third quartile | 7.63 | 0.03 |
| Top quartile | 7.64 | 0.03 |

(b) Truancy from school

| | Bottom quartile | Second quartile | Third quartile | Top quartile |
|---|---|---|---|---|
| None | 80% | 80% | 80% | 81% |
| One or two times | 16% | 16% | 16% | 16% |
| Three or four times | 3% | 3% | 3% | 2% |
| Five or more times | 1% | 1% | 1% | 1% |
| Total | 100% | 100% | 100% | 100% |

Notes: Figures in panel (a) refer to the average amount of effort children say that they put into the PISA test out of 10. Panel (b) provides column percentages; it refers to the number of times young people said that they skipped school for a whole day over the last two weeks.
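The robustness check behind Table 7 rests on a Pearson correlation between the bullshit index and self-reported test effort. The sketch below shows the calculation on a handful of made-up values; it is unweighted, the data are purely illustrative and not drawn from PISA, and the function name `pearson` is ours. A coefficient close to zero, as the paper reports, would be read as the index being unrelated to effort.

```python
import math


def pearson(x, y):
    """Unweighted Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical standardised bullshit index and 0-10 test effort for five pupils.
bullshit_index = [-1.2, -0.5, 0.0, 0.4, 1.3]
test_effort = [8, 7, 9, 7, 8]

r = pearson(bullshit_index, test_effort)
```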

Appendix A. Measurement invariance tests

Measurement invariance methods

Teenagers’ responses to the three fake mathematics concepts are used to derive the bullshit scale. This is done via estimation of a Confirmatory Factor Analysis (CFA) model, with the three fake items treated as observed indicators of the latent bullshit construct. As our aim is to compare average scores on this scale, it is important that we investigate whether the latent construct is consistently understood and measured in the same way across demographic groups. We therefore follow standard practice and test for three hierarchical levels of measurement invariance (configural, metric and scalar) within each country according to the following demographic characteristics:

• Gender
• Socio-economic status
• Mathematics achievement quartile
• Immigrant status

The intuition behind this approach, with reference to the bullshit scale, is presented in Figure 1. Ovals depict the unobserved latent construct we are trying to measure, while rectangles refer to young people’s observed responses to the three fake mathematics items. Specifically, $Q_w^x$ represents a single question $w$ in country $x$. Factor loadings are represented by $\lambda_w^x$, and quantify the strength of the relationship between the latent bullshit trait and question $w$, in country $x$. On the other hand, $\tau_w^x$ is the ‘threshold’ and is essentially equivalent to the constant term in a regression model.
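In symbols, the model described above can be sketched as follows. This assumes the usual latent-response formulation for ordered-categorical indicators, which the text does not spell out. The person index $i$, the latent trait $\eta_i^x$, the residual $\varepsilon_{iw}^x$, the latent response $Q_{iw}^{*x}$ and the category index $c$ are notation introduced here for illustration, and with more than two response categories each item has several thresholds rather than one.

```latex
\begin{align}
Q_{iw}^{*x} &= \lambda_w^x \, \eta_i^x + \varepsilon_{iw}^x , \\
Q_{iw}^{x} &= c \quad \text{if } \tau_{w,c}^x < Q_{iw}^{*x} \le \tau_{w,c+1}^x .
\end{align}
```

Under this reading, invariance testing asks whether $\lambda_w^x$ and then $\tau_{w,c}^x$ can be held equal across the groups being compared without a material loss of model fit.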

Figure 1. A hypothetical example of the MGCFA model to test invariance of the bullshit scale across groups.

[Figure: two path diagrams, Group 1 and Group 2. In each, the latent ‘Bullshit’ construct points to three observed items $Q_1$, $Q_2$ and $Q_3$, with group-specific factor loadings $\lambda_1$, $\lambda_2$, $\lambda_3$ and thresholds $\tau_1$, $\tau_2$, $\tau_3$.]

The factor loadings ($\lambda_w^x$) and thresholds ($\tau_w^x$) are the main properties of the bullshit scale, and the key parameters used to test for ‘measurement invariance’ (i.e. comparability of the bullshit scale) across groups. Basically, measurement invariance involves putting ever more constraints upon the factor loadings ($\lambda_w^x$) and thresholds ($\tau_w^x$), to test whether three hierarchical levels of invariance hold. These are:

• Configural invariance (level 1). This requires the same set of questions to be associated with the latent trait across all groups. With respect to Figure 1, if the loadings $\lambda_1^1$, $\lambda_2^1$ and $\lambda_3^1$ are all unequal to zero in group 1 (e.g. boys), they should also be unequal to zero in group 2 (e.g. girls).
• Metric invariance (level 2). This assumes that the factor loadings ($\lambda$) are equal across groups. In terms of Figure 1, this means that $\lambda_1^1 = \lambda_1^2$, $\lambda_2^1 = \lambda_2^2$ and $\lambda_3^1 = \lambda_3^2$. If this level of invariance is established, then it is widely accepted that one can use the bullshit scale as an independent variable in a regression model, and that the estimated parameters can be fairly compared across groups.
• Scalar invariance (level 3). This level of invariance is required for average bullshit scale scores to be fairly compared across groups. This imposes the additional assumption that the thresholds ($\tau$) we wish to compare are also equal. Again returning to Figure 1, we now also need $\tau_1^1 = \tau_1^2$, $\tau_2^1 = \tau_2^2$ and $\tau_3^1 = \tau_3^2$.

Following standard practice in the literature (Putnick and Bornstein 2016), we determine which level of measurement invariance holds through the use of ‘fit indices’. Although a simple

$\chi^2$ test is sometimes used to decide between such nested models, this is highly sensitive to sample size (Chen, 2007; Cheung and Rensvold, 2002). Given our large sample sizes (several thousand observations in all countries) we focus upon three alternative fit indices instead. These are the Comparative Fit Index (CFI), the Tucker-Lewis Index (TLI) and the Root Mean Square Error of Approximation (RMSEA). The intuition behind such indices is that they help determine how much worse the MGCFA model fits the data when additional equality constraints are imposed between groups. Zieger, Jerrim and Sims (2018) provide further details. We use the following widely used rules of thumb when deciding which level of measurement invariance holds according to these indices (Chen 2007):

• Configural. CFI ≥ 0.95, TLI ≥ 0.95.
• Metric. Decrease in the CFI and TLI versus configural model of less than 0.01.
• Scalar. Decrease in the CFI and TLI versus metric model of less than 0.01. Increase in the RMSEA less than 0.01.

Note that we only use the RMSEA to distinguish between whether metric or scalar invariance holds. This is due to the configural and metric models having low degrees of freedom, with the RMSEA known to be problematic in such instances (Kenny, Kaniskan and McCoach 2015).

As we shall discuss below, for some comparisons in some countries, full scalar invariance does not seem to hold. Having inspected the parameter estimates, this seems to be mainly driven by the ‘proper number’ item functioning differently across some groups (potentially due to this being confused with the actual mathematical concept of a ‘real number’). In such instances we relax the assumption of full scalar invariance across groups and assume only partial scalar invariance holds instead. Specifically, the ‘proper number’ thresholds are allowed to differ across groups and alternative bullshit scale scores are then produced.
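The rules of thumb listed above can be written as a small decision function. This is only a sketch of one reading of those rules: the function name, the dictionary layout and the fit values in the example are hypothetical and are not taken from the appendix tables. Following the text, the RMSEA is used only when moving from the metric to the scalar model.

```python
def invariance_level(configural, metric, scalar, tol=0.01, floor=0.95):
    """Return the highest invariance level supported by the rules of thumb.

    Each argument is a dict of fit indices with keys "cfi", "tli", "rmsea".
    """
    # Configural: absolute fit of the unconstrained multi-group model.
    if configural["cfi"] < floor or configural["tli"] < floor:
        return "none"

    # Metric: CFI and TLI must each fall by less than tol versus configural.
    metric_ok = (configural["cfi"] - metric["cfi"] < tol
                 and configural["tli"] - metric["tli"] < tol)
    if not metric_ok:
        return "configural"

    # Scalar: same CFI/TLI rule versus metric, plus RMSEA rising by < tol.
    scalar_ok = (metric["cfi"] - scalar["cfi"] < tol
                 and metric["tli"] - scalar["tli"] < tol
                 and scalar["rmsea"] - metric["rmsea"] < tol)
    return "scalar" if scalar_ok else "metric"


# Hypothetical fit indices for one country and one grouping variable.
configural_fit = {"cfi": 0.995, "tli": 0.990, "rmsea": 0.030}
metric_fit = {"cfi": 0.992, "tli": 0.989, "rmsea": 0.030}
scalar_fail = {"cfi": 0.975, "tli": 0.980, "rmsea": 0.050}
scalar_pass = {"cfi": 0.990, "tli": 0.988, "rmsea": 0.035}

level_when_scalar_fails = invariance_level(configural_fit, metric_fit, scalar_fail)
level_when_scalar_holds = invariance_level(configural_fit, metric_fit, scalar_pass)
```

In the first example the CFI drops by 0.017 between the metric and scalar models, so only metric invariance is supported; this is the situation in which the paper falls back on partial scalar invariance.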
Comparisons across groups are then made using these alternative scores to ensure that changing our assumptions regarding full versus partial invariance does not lead to substantially different results.

These MGCFA models are fitted using Mplus (Muthén and Muthén, 1998-2017), with the final student weights applied and standard errors clustered at the school level. A WLSMV estimator with THETA parameterization⁷ was used to account for the ordered categorical nature of the questions (Muthén et al. 2015). Bullshit scale scores are then derived from these models, with average scores then compared across groups where scalar measurement invariance approximately holds. The bullshit scale has been standardised to mean zero and standard deviation one within each country, so that all differences between groups can be interpreted in terms of an effect size.

7 WLSMV is an estimator which is suitable for categorical variables. It performs a probit regression using a robust weighted least squares estimator with a diagonal weight matrix. THETA parameterization allows the residual variances of the latent trait to be parameters in the model, while excluding scale factors (Muthén & Muthén, 1998-2017).

Gender

Appendix Table A1 presents results from our measurement invariance tests for the bullshit scale across genders. All three levels of measurement invariance (configural, metric and scalar) hold according to our model fit criteria within each Anglophone country. We consequently conclude that average bullshit scale scores can be legitimately compared between males and females within all nine Anglophone countries.

Socio-economic status

Appendix Table A2 presents the measurement invariance test results with respect to socio-economic status. With the exception of Northern Ireland, configural and metric invariance of the scale holds in every country according to both the CFI and TLI criteria (Northern Ireland passes the former for metric invariance but fails according to the TLI). However, the picture is more mixed for full scalar invariance. Specifically, Australia and New Zealand fail to meet the full scalar invariance criteria on two out of the three indices, England fails the RMSEA, and Scotland and Northern Ireland fail the CFI. However, once we release the ‘proper number’ item parameters, partial scalar measurement invariance is met in all countries.

Appendix Table A3 illustrates the difference between the full (top panel) and partial (bottom panel) scalar results for comparisons across socio-economic groups. In all countries, we find a non-trivial and statistically significant socio-economic gap in average bullshit scores regardless of whether the full or partial invariance scale scores are used.
However, the magnitude of the gap is smaller based upon the partial invariance results. This is particularly the case in countries like Australia, Ireland, Scotland and New Zealand, where the difference in average bullshit scores between the top and bottom socio-economic group falls by around 0.2. Nevertheless, our key substantive finding that higher socio-economic status pupils are bigger bullshitters than their low socio-economic peers clearly continues to hold.

Mathematics achievement

Appendix Table A4 presents the measurement invariance test results with respect to mathematics. Configural and metric invariance is found to hold for every country according to

34 both the CFI and TLI criteria. The same is not true, however, with respect to scalar invariance where most countries fail based upon the CFI and RMSEA. Consequently, the full scalar measurement invariance results for mathematics achievement quartiles appear to be particularly problematic. Further inspection of the data and item parameters revealed that higher achieving young people were disproportiona tely likely to say that they ha d heard of ‘proper numbers’. Consequently, once we release the threshold parameters for this item under the partial scalar invariance model, all three fit indice s take on much more reasonable values. However, Appendix Table A5 reveals that there are huge discrepancies between the full and partial scalar results with respect to differences between mathematics achievement group. In Ireland, for instance, the differ ence in the bullshit index between young people in the top and bottom mathematics achievement quartile is 0.87 according to the full scalar invariance results. ial Yet this drops to 0.11 standard deviations and becomes statistically insignificant for the part scalar results. A similar difference between the full and partial scalar results holds in most countries, with large declines in the estimated differences between the different achievement hit scale scores between high and groups. This indicates that most of the difference in the bulls low ability pupils is being driven by differences in responses to the ‘proper number’ variable. We consequently conclude from Table A5 that differences in average bullshit scores across mathematics achievement groups are n ot particularly robust, and therefore do not include these results within the paper. Immigrant status Appendix Table A6 turns to the measurement invariance results with respect to immigrant status. The fit indices reveal that all countries meet the confi gural and metric criteria, with most also satisfying the full scalar invariance test as well. 
The potential exceptions are England, New Zealand and Wales, where the RMSEA criterion for full scalar invariance is not satisfied (though the CFI and TLI criteria are). Moreover, for all nine countries, partial scalar invariance clearly holds. Appendix Table A7 illustrates that our key conclusions remain intact regardless of whether the full or partial scalar invariance scale scores are used. Specifically, we consistently find immigrants to be bigger bullshitters than country natives in each country other than the United States. Moreover, for most countries, the magnitude of the difference between immigrants and natives remains reasonably consistent between the full and partial scalar scale scores.
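
To make the effect-size interpretation used in the comparisons above concrete, the following is a minimal sketch of standardising a scale within each country and then taking the difference in group means. It is not the procedure used to produce the scale scores in this paper: the function names, the country label, the scores and the group labels are all hypothetical, the population standard deviation is an assumption of the sketch, and survey weights are ignored.

```python
from statistics import mean, pstdev


def standardise_within_country(scores_by_country):
    """Rescale raw scores to mean 0 and SD 1 separately within each country."""
    standardised = {}
    for country, scores in scores_by_country.items():
        mu = mean(scores)
        sigma = pstdev(scores)  # population SD: an assumption of this sketch
        standardised[country] = [(s - mu) / sigma for s in scores]
    return standardised


def group_gap(z_scores, group_labels, group_a, group_b):
    """Mean standardised score of group_a minus that of group_b (in SD units)."""
    a = [z for z, g in zip(z_scores, group_labels) if g == group_a]
    b = [z for z, g in zip(z_scores, group_labels) if g == group_b]
    return mean(a) - mean(b)


# Hypothetical raw scores and group labels, invented purely for illustration.
raw = {"Country X": [0.2, 1.4, -0.6, 0.9, -1.1, 0.4]}
labels = ["immigrant", "immigrant", "native", "immigrant", "native", "native"]

z = standardise_within_country(raw)["Country X"]
gap = group_gap(z, labels, "immigrant", "native")
print(f"Immigrant-native gap: {gap:.2f} SD")
```

Because the rescaling is done country by country, a gap computed this way is expressed in units of that country's own standard deviation, which is what allows group differences to be read as effect sizes.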

Country

The criteria we use for testing measurement invariance across countries are slightly different to the above. Specifically, as noted by Rutkowski and Svetina (2014), the CFI and RMSEA cut-offs typically used to judge whether measurement invariance holds are based upon an assumption of a small number of groups. This is typically violated in cross-national research, when the number of groups is large. Consequently, Rutkowski and Svetina (2014) suggested using slightly more liberal cut-off values for metric invariance tests when more than a handful of groups are being compared. Most relevant to this paper, they suggested that the change in the CFI between the configural and metric models should be less than 0.02 (rather than the usual 0.01). We follow this advice when testing for measurement invariance across countries.

We began our measurement invariance tests for comparability across countries by including all nine countries within our MGCFA models. However, this revealed a significant problem with Wales, where there was evidence of a very poorly fitting model. Wales has therefore been excluded from our cross-national comparisons, and all invariance tests are based upon the remaining eight countries.

Results from these measurement invariance tests are presented in Appendix Table A8. Configural, metric and scalar invariance is met according to the CFI and TLI criteria. Likewise, the change in the RMSEA between the metric and scalar models is sufficiently low to also indicate that full scalar measurement invariance has been met. Nevertheless, to illustrate the robustness of our results, we also produce partial scalar scores where the ‘proper number’ thresholds have been released in the three countries (Australia, England and the United States) that contributed the greatest change to the chi-squared statistic between the metric and scalar models.
Cross-country comparisons of average bullshit scores based upon the partial and full scalar invariance models can be found in Appendix Tables A9 and A10. They both provide a similar ranking of countries and have the same pattern of statistically significant differences. Although there are some modest changes in terms of the magnitude of the differences between countries, overall these results suggest that our conclusions regarding cross-country comparisons of bullshit scores are robust.
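
The cut-off rule for metric invariance described in this section can be sketched as a simple check on the change in the CFI. This is an illustrative sketch only: the function names and the `many_groups` flag are hypothetical, the CFI values are invented rather than taken from Appendix Table A8, and the caller decides what counts as "more than a handful" of groups.

```python
def delta_cfi(cfi_less_constrained, cfi_more_constrained):
    """Drop in CFI when moving to the more constrained model."""
    return cfi_less_constrained - cfi_more_constrained


def metric_invariance_holds(cfi_configural, cfi_metric, many_groups):
    """Check the change in CFI between the configural and metric models.

    Uses the more liberal 0.02 cut-off when many groups are compared,
    and the usual 0.01 cut-off otherwise.
    """
    cutoff = 0.02 if many_groups else 0.01
    return delta_cfi(cfi_configural, cfi_metric) < cutoff


# Hypothetical fit indices, invented purely for illustration.
cfi_configural, cfi_metric = 0.990, 0.975

for many_groups in (False, True):
    verdict = metric_invariance_holds(cfi_configural, cfi_metric, many_groups)
    print(f"many_groups={many_groups}: metric invariance holds = {verdict}")
```

With these invented values the same pair of models fails the usual criterion but passes the more liberal one, which is the situation the adjusted cut-off is designed to address.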

Table A1. Measurement invariance tests of the bullshit scale by gender

                          CFI                      TLI                     RMSEA
                   Config Metric Scalar    Config Metric Scalar    Config Metric Scalar
Australia           1.000  1.000  1.000     1.000  1.000  1.000     0.000  0.000  0.007
Canada              1.000  1.000  0.999     1.000  0.999  0.999     0.000  0.018  0.012
England             1.000  0.999  0.999     1.000  0.998  0.999     0.000  0.029  0.019
Ireland             1.000  1.000  0.999     1.000  1.001  1.000     0.000  0.000  0.009
New Zealand         1.000  1.000  1.000     1.000  1.000  1.000     0.000  0.000  0.000
Northern Ireland    1.000  1.000  1.000     1.000  1.004  1.000     0.000  0.000  0.003
Scotland            1.000  1.000  1.000     1.000  1.002  1.001     0.000  0.000  0.000
USA                 1.000  0.998  1.000     0.999  1.000  0.999     0.000  0.018  0.027
Wales               1.000  0.998  0.995     1.000  0.993  0.997     0.000  0.052  0.034
