Homophily is a concept in sociology describing the tendency of individuals to associate and bond with similar others, as in the proverb "birds of a feather flock together". The presence of homophily has been discovered in a vast array of network studies: many studies have observed homophily in some form or another, and they establish that similarity is associated with connection. The categories on which homophily occurs include age, gender, class, and organizational role.

The opposite of homophily is heterophily or intermingling. Individuals in homophilic relationships share common characteristics (beliefs, values, education, etc.) that make communication and relationship formation easier. Homophily between mated pairs in animals has been extensively studied in the field of evolutionary biology, where it is known as assortative mating. Homophily between mated pairs is common within natural animal mating populations.

Homophily has a variety of consequences for social and economic outcomes.

Status vs. value

In their original formulation of homophily, Paul Lazarsfeld and Robert K. Merton (1954) distinguished between status homophily and value homophily; individuals with similar social status characteristics were more likely to associate with each other than by chance:

  • Value homophily: involves association with others who have similar values, attitudes, and beliefs, regardless of differences in status characteristics.

Race and ethnicity account for a large proportion of inbreeding homophily (though classification by these criteria can be problematic in sociology due to fuzzy boundaries and different definitions of race).

Smaller groups have lower diversity simply due to the number of members. This tends to give racial and ethnic minority groups a higher baseline homophily. Race and ethnicity also correlate with educational attainment and occupation, which further increases baseline homophily.

Research on Roma migrants indicates that, despite discrimination and social exclusion, they maintain diverse and specialized personal networks that extend beyond co-ethnic relations. Social support is distributed across different types of ties, with family, co-ethnic, and non-Roma contacts playing distinct and complementary roles. These findings challenge common assumptions that anti-Romani hostility necessarily results in ethnic closure or social isolation, and instead point to adaptive and heterogeneous networking strategies.

Sex and gender

In terms of sex and gender, baseline homophily in networks is relatively low compared to race and ethnicity. Men and women frequently live together, and the two populations are large and normally roughly equal in size. It is also common to find higher levels of gender homophily among school students. Most sex homophily is a result of inbreeding homophily.

Research on age homophily has indicated a strong relationship between a person's age and their social distance to other people with regard to confiding in someone. For example, the larger the age gap, the smaller the chance that a person was confided in by younger people to "discuss important matters."

Additionally, as more users rely on the Internet to find like-minded communities, many niches within social media sites have appeared to meet this need. This has contributed to the popularity of sites like Reddit in the 2010s, which advertises itself as a "home to thousands of communities... and authentic human interaction".

Social media

As social networks are largely divided by race, social-networking websites like Facebook also foster homophilic atmospheres. When a Facebook user 'likes' or interacts with an article or post of a certain ideology, Facebook continues to show that user posts of a similar ideology (which Facebook believes they will be drawn to). In a research article, McPherson, Smith-Lovin, and Cook (2003) write that homogeneous personal networks result in limited "social worlds in a way that has powerful implications for the information they receive, the attitudes they form, and the interactions they experience." This homophily can foster divides and echo chambers on social networking sites, where people of similar ideologies interact only with each other.

Homophily in networks

In network science, homophily (also called assortative mixing) is the tendency of connected nodes to be similar in some attribute, such as sex, age, ethnicity, political preference, or vaccination status. In a network, nodes represent entities (for example, people) and links represent interactions (such as friendship, contact, or communication). Thus, a social network may show sex-based homophily if men are more likely to be linked to men and women to women, and an epidemic contact network may show vaccination homophily if vaccinated individuals interact disproportionately with other vaccinated individuals (and likewise for unvaccinated individuals). Homophilic network structure matters because it can change how processes spread on networks: in epidemic settings, vaccination homophily can weaken the population-level spillover of protection, raise the vaccine coverage needed for herd immunity, and increase outbreak size (with details depending on vaccine efficacy).

Coleman's Index

A standard way to quantify homophily in a network with categorical labels is the assortativity coefficient <math>h</math>, which compares the observed fraction of same-type links to what would be expected under random mixing. If <math>e_{ij}</math> is the fraction of links that connect a node of type <math>i</math> to a node of type <math>j</math>, and <math>a_i = \sum_j e_{ij}</math> and <math>b_j = \sum_i e_{ij}</math> are the fractions of link-ends (stubs) attached to nodes of type <math>i</math> and <math>j</math>, respectively, then homophily can be defined as:

<math>h = \frac{\sum_i e_{ii}-\sum_i a_i b_i}{1-\sum_i a_i b_i},</math>

where in undirected networks, <math>a_i = b_i.</math> Here, <math>\sum_i e_{ii}</math> is the observed fraction of within-group links, and <math>\sum_i a_i b_i</math> is the expected fraction under random mixing with the same group-level link-end frequencies. Values near 1 indicate strong homophily, values near 0 indicate approximately random mixing, and negative values indicate heterophily (preference for dissimilar connections).
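As an illustration, the formula above can be evaluated directly from a list of links and a table of node types. The following is a minimal Python sketch for the undirected case, not a reference implementation; the function name, the input format, and the two toy networks are assumptions made for this example.

```python
def assortativity_coefficient(edges, labels):
    """Assortativity coefficient h for an undirected network with categorical labels.

    edges  -- list of (u, v) node pairs (hypothetical input format)
    labels -- dict mapping every node to its type/group
    """
    counts = {}
    for u, v in edges:
        i, j = labels[u], labels[v]
        # Record both link-ends so that the mixing matrix e is symmetric.
        counts[(i, j)] = counts.get((i, j), 0) + 1
        counts[(j, i)] = counts.get((j, i), 0) + 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("the network has no links")

    # e[(i, j)]: fraction of link-ends joining type i to type j.
    e = {pair: c / total for pair, c in counts.items()}
    groups = set(labels.values())
    # a[i]: fraction of link-ends attached to type i (a_i = b_i when undirected).
    a = {i: sum(e.get((i, j), 0.0) for j in groups) for i in groups}

    observed = sum(e.get((i, i), 0.0) for i in groups)  # sum_i e_ii
    expected = sum(a[i] ** 2 for i in groups)           # sum_i a_i b_i
    if expected == 1.0:
        raise ValueError("h is undefined when all link-ends belong to one type")
    return (observed - expected) / (1.0 - expected)


# Toy data (assumed): two same-type triangles joined by a single cross-type link.
labels = {"a": "A", "b": "A", "c": "A", "d": "B", "e": "B", "f": "B"}
two_triangles = [("a", "b"), ("b", "c"), ("c", "a"),
                 ("d", "e"), ("e", "f"), ("f", "d"), ("c", "d")]
h_triangles = assortativity_coefficient(two_triangles, labels)

# Toy data (assumed): every link joins nodes of different types.
cross_only = [("a", "d"), ("b", "e")]
h_cross = assortativity_coefficient(cross_only, labels)
```

In the first toy network, 6 of the 7 links are within-group and each type holds half of the link-ends, so <math>h = (6/7 - 1/2)/(1 - 1/2) = 5/7</math>. In the second, no link is within-group and <math>h = -1</math>.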

Local vs. global homophily

The assortativity coefficient, <math>h</math>, is only a global summary statistic (one number for the whole network), so it can hide important heterogeneity. Nevertheless, homophily can be defined and measured more locally, for example at the node/neighborhood level and at the group-size or clique level (within-group vs across-group homophily), which can lead to different predictions for connectivity and contagion even when the global assortativity is the same.
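One simple node-level measure, shown here only as an illustration, is the fraction of a node's neighbours that share its type. The sketch below assumes an undirected network without self-loops; the function name and the toy data are hypothetical, and other local definitions are possible.

```python
from collections import defaultdict


def local_homophily(edges, labels):
    """Return {node: fraction of that node's neighbours with the same label}.

    Nodes without any links are omitted, since the fraction is undefined for them.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    return {
        node: sum(labels[n] == labels[node] for n in nbrs) / len(nbrs)
        for node, nbrs in neighbours.items()
    }


# Toy data (assumed): two same-type triangles joined by a single cross-type link.
labels = {"a": "A", "b": "A", "c": "A", "d": "B", "e": "B", "f": "B"}
edges = [("a", "b"), ("b", "c"), ("c", "a"),
         ("d", "e"), ("e", "f"), ("f", "d"), ("c", "d")]
scores = local_homophily(edges, labels)
```

In this toy network, the nodes "a", "b", "e", and "f" have only same-type neighbours (score 1), while the two bridging nodes "c" and "d" each score 2/3. A single global value would not show this difference.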

Causes and effects

Causes

Geography: Baseline homophily often arises when people who are located nearby also have similar characteristics. People are more likely to have contact with those who are geographically closer than with those who are distant. Technologies such as the telephone, e-mail, and social networks have reduced but not eliminated this effect.

Family ties: These ties decay slowly, but familial ties, specifically those between domestic partners, fulfill many of the requisites that generate homophily. Family members are generally close and keep in frequent contact even when they are at great geographic distances. Ideas that may get lost in other relational contexts will often instead lead to action in this setting.

Organizations: School, work, and volunteer activities provide the great majority of non-family ties. Many friendships, confiding relations, and social support ties are formed within voluntary groups. The social homogeneity of most organizations creates a strong baseline homophily in networks that are formed there.

Isomorphic sources: The connections between people who occupy equivalent roles will induce homophily in the system of network ties. This is common in three domains: workplace (e.g., all heads of HR departments will tend to associate with other HR heads), family (e.g., mothers tend to associate with other mothers), and informal networks.

Cognitive processes: People who are demographically similar tend to share knowledge, and therefore communicate more easily and share cultural tastes, which can also generate homophily.

Effects

According to one study, perception of interpersonal similarity improves coordination and increases the expected payoff of interactions, above and beyond the effect of merely "liking others." Another study claims that homophily produces tolerance and cooperation in social spaces. However, homophilic patterns can also restrict access to information or inclusion for minorities.

Nowadays, the restrictive patterns of homophily can be widely seen within social media. This selectiveness within social media networks can be traced back to the origins of Facebook and the transition of users from MySpace to Facebook in the early 2000s. One 2011 study of this shift in a network's user base found that this perception of homophily affected many individuals' preference for one site over another. Most users chose to be more active on the site their friends were on. However, along with the complexities of belongingness, people of similar ages, economic class, and prospective futures (higher education and/or career plans) shared similar reasons for favoring one social media platform. The different features of homophily affected their outlook on each respective site.

The effects of homophily on the diffusion of information and behaviors are also complex. Some studies have claimed that homophily facilitates access to information, the diffusion of innovations and behaviors, and the formation of social norms. Other studies, however, highlight mechanisms through which homophily can maintain disagreement, exacerbate polarization of opinions, lead to self-segregation between groups, and slow the formation of an overall consensus.

As online users have a degree of power to form and dictate the environment, the effects of homophily continue to persist. On Twitter, terms such as "stan Twitter", "Black Twitter", or "local Twitter" have also been created and popularized by users to separate themselves based on specific dimensions.

Homophily is a cause of homogamy, that is, marriage between people with similar characteristics. Homophily is also a fertility factor: increased fertility is seen in people with a tendency to seek acquaintance among those with common characteristics.

Heterophilic graph learning

In graph representation learning, homophily means that nodes with the same label or attributes are more likely to be connected in the network. It is considered the most important reason for the success of graph neural networks (GNNs). However, it has recently been found that there is a non-trivial set of datasets where the homophily principle does not hold (i.e., heterophily) and where GNN performance is unsatisfactory. Considerable effort has been devoted to the study of heterophilic graph learning.
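One simple way to describe how far a labeled graph departs from the homophily principle is the fraction of links that join nodes with the same label, often called the edge homophily ratio. The Python sketch below is illustrative only; the function name and the toy graphs are assumptions for this example.

```python
def edge_homophily_ratio(edges, labels):
    """Fraction of links whose two endpoints carry the same label."""
    edges = list(edges)
    if not edges:
        raise ValueError("the graph has no links")
    same = sum(labels[u] == labels[v] for u, v in edges)
    return same / len(edges)


# Toy graph (assumed): a path 0-1-2-3 whose first two and last two nodes share a label.
path_labels = {0: "x", 1: "x", 2: "y", 3: "y"}
path_edges = [(0, 1), (1, 2), (2, 3)]
ratio_path = edge_homophily_ratio(path_edges, path_labels)

# Toy graph (assumed): a 4-cycle with alternating labels, so every link is cross-label.
cycle_labels = {0: "x", 1: "y", 2: "x", 3: "y"}
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
ratio_cycle = edge_homophily_ratio(cycle_edges, cycle_labels)
```

The path has two same-label links out of three (ratio 2/3), while the alternating cycle has none (ratio 0), which makes it a fully heterophilic toy case.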

See also

  • Groupthink
  • Echo chamber (media)

References