Algorithmic personalization: a study of knowledge gaps and digital media literacy
Introduction
In today’s era, there is a notable surge in the creation of personalized online messages tailored to individuals’ diverse characteristics and interests. Advances in technology now allow for real-time customization of media messages, sparking varied opinions and concerns regarding personalized news content and services among participants in the media ecosystem (Gran et al. 2021). These ecosystem actors’ diverse orientations and perceptions towards personalization stem from many factors, reflecting its complex implications (Segijn and Van Ooijen 2022). From a market perspective, personalization is critical in segmenting audiences and devising segmentation strategies. Effective market segmentation significantly impacts enterprise profitability and is heavily reliant on the formulation and execution of marketing strategies. These strategies benefit from a wealth of demographic and psychographic data, offering businesses a competitive edge by prioritizing customer satisfaction, a key performance indicator in business operations (Zhao 2021). The strategic acquisition of data on customer purchase behavior, message preferences, and decision-making processes is crucial, intersecting legislative, technological, and socio-psychological domains (Chandra et al. 2022). Concurrently, increasing public apprehension regarding the misuse of collected data stresses the necessity for comprehensive research into attitudes toward personalized services and awareness levels (Siraj 2021; Lim and Zhang 2022). Understanding these facets is essential for enhancing trust in media services and establishing the regulatory and protective frameworks that support the media ecosystem’s growth.
In this research, personalization of content refers to the adaptation of information according to the user’s individual needs, preferences, or behavior. This process ensures that users encounter content aligned with their interests or needs, based on data such as search history, online behavior, and demographics. Understanding personalized online content is crucial for societal information security (Xu et al. 2022). Individuals who are informed about how personalization operates and the data it involves can better evaluate privacy risks and protect their information. Awareness can also guard against manipulation through tailored content, encouraging a critical stance towards received information and resistance to dubious sources. Knowledgeable users can adjust their online behavior to influence content personalization algorithms, aiming for more balanced and varied information. Additionally, comprehension of these algorithms heightens vigilance against mis/disinformation and enhances response effectiveness (Aiolfi et al. 2021). Evaluating public knowledge on personalized content hinges on awareness and comprehension of its principles, technologies, risks, and benefits (Aguirre et al. 2016). Key factors influencing this assessment include understanding personalized content concepts; familiarity with the technologies and algorithms behind recommendations; awareness of privacy and security risks tied to personal data use for content personalization; evaluating personalized versus general content’s pros and cons; and personal data management in the context of content personalization. These elements shape the public’s knowledge scope regarding personalized content and their readiness to engage with digital media content thoughtfully (Paul et al. 2023).
Enhancing public understanding of the mechanics and dangers associated with personalized online content is crucial for bolstering societal information security and resilience against manipulation and disinformation efforts (Buiten 2022). It can be argued that the impact of content personalization on information security lies in the risks and threats associated with the collection, storage, and use of personal data to customize content. This highlights the focus of our research on analyzing public attitudes toward personalized services. Methodologically, our study aims to inform decision-making by assessing the awareness of the Czech Republic’s populace across different social strata regarding personalized content facilitated by artificial intelligence in the digital media landscape. The findings are instrumental in developing a media literacy framework that embodies a comprehensive, multidimensional approach. The efficacy of this framework hinges on exploring a spectrum of socio-demographic, socio-political, ideological, and psychological factors, alongside examining intricate causal relationships within these domains. Understanding these elements necessitates a thorough examination of how different population segments perceive and interact with various aspects of societal systems and geopolitical changes. It also involves analyzing societal transformations and the effects of disinformation campaigns.
The primary aim of this research is to develop an information-analytical framework that can assess the knowledge levels of different social classes regarding personalized content within the digital media ecosystem.
This research contributes to digital media literacy by introducing a novel information-analytical system. Comprising an information model, fuzzy logic approach, and social class disparity analysis, this system enables a deeper understanding of public knowledge about personalized content. It extends beyond basic awareness, examining user preferences, trust levels, and desired control mechanisms. By highlighting significant social class disparities in digital media literacy, the study informs targeted educational interventions and opens new avenues for research on the ethical and socio-political implications of personalized media services.
The methodology supports a comprehensive analysis of knowledge levels regarding personalized content across various demographic segments and social classes, providing insights into demographic-based distinctions. The core hypothesis suggests that within a particular social class, a high degree of awareness about content personalization processes, technical methods, and control over online content correlates with substantial knowledge about personalized content in the digital media ecosystem. This correlation is assessed region-wide by the developed information-analytical system.
An Overview of Research Studies
Social media platforms deploy algorithms to curate content, tailoring user experiences individually. However, many users engage with these platforms unaware of the personalization processes and their implications, highlighting the necessity for enhanced literacy among social media enthusiasts (Asraful et al. 2018). Research on personalized services has primarily focused on individual perceptions, emphasizing the significance of socio-demographic factors alongside the potential risks stemming from inadequate media and algorithmic literacy (Fletcher and Nielsen 2019). Segijn and Van Ooijen (2020) explored personalization techniques among the American populace, finding techniques like psychographic segmentation, hashtag tracking, and geofencing universally deemed intrusive and associated with surveillance. Interestingly, younger individuals appeared more amenable to personalization, favoring reduced oversight compared to older demographics. Contrarily, some studies flag the advantages of personalized messaging tailored to specific interests and characteristics, advocating for its efficiency and the strategic exclusion of non-targeted demographics (Rund – AdAge 2018; Kumar and Gupta 2016; Maslowska et al. 2016). Despite recognizing personalization’s benefits, concerns persist over privacy and surveillance, with users apprehensive about their data being misappropriated by various entities (Phelps et al. 2000; Phelan et al. 2016), indicating that the extent of these apprehensions varies across different demographic segments.
The reluctance to accept personalized messages might stem from a lack of understanding about the personalization process among users, despite a broad awareness of cookies and their functionalities (Ur et al. 2012; Segijn 2019). Segijn and Van Ooijen (2020) found that attitudes towards personalization and concerns about surveillance significantly predict the acceptance of personalization techniques. Investigating the varied acceptance rates of these techniques can uncover the underlying factors that lead to the adoption of certain methods over others. This discrepancy could be tied to socio-demographic characteristics with psychological underpinnings, potentially indicating varying levels of trust in personalization tools based on users’ awareness (Liu and Tao 2022; Kalyanaraman and Sundar 2006). In analyzing attitudes towards service personalization, the concept of privacy cynicism plays a crucial role. Privacy cynicism, related to increased mobile device usage, describes a cognitive stance where users overlook privacy concerns, engaging in online activities without bolstering privacy measures (Hoffmann et al. 2016; Taddicken 2014). Research highlights certain demographic groups’ hesitance towards online data collection, fearing how businesses might exploit personal data for targeted advertising. This apprehension can lead to advertising aversion, influencing future consumer behaviors and individual profile development (Kang and Shin 2016; Kokolakis 2017), pointing to the complex interplay between personalization acceptance, privacy concerns, and behavioral responses.
Research has identified socio-demographic characteristics as key in influencing perceptions towards personalized service techniques and concerns about intensive monitoring or misuse of personal data (Van der Goot et al. 2018). Smit et al. (2014) found that older individuals tend to view personalized advertising more negatively than younger counterparts, exhibiting a higher propensity for privacy protection. Gender differences also contribute to varying perceptions of personalized content, with females generally expressing greater concern over the use of personalized tools and prioritizing privacy more than males. However, studies present conflicting findings, with some suggesting men are more cautious and resistant to personalization tools (Milne et al. 2023), while others report no significant gender differences (Boerman et al. 2017). These discrepancies show the need for further examination of socio-demographic factors in relation to literacy levels, ideological beliefs, information resource preferences, attitudes towards information technology, and consumer behavior. The concept of the personalization paradox and tradeoff in personal data protection offers insight into the acceptance of personalization against a backdrop of privacy concerns (Karwatzki et al. 2017; Siraj 2021). This paradox examines the balance between the costs, such as privacy concerns (Awad and Krishnan 2006) and perceived surveillance (Segijn 2019), and the benefits of receiving more relevant content and personal discounts (Kalyanaraman and Sundar 2006; Alfnes and Wasenden 2022). The interplay of these factors significantly impacts individuals’ decisions to share personal data, highlighting the complexity of consumer attitudes towards personalized services and the need for a deeper understanding of these dynamics (Wottrich et al. 2018).
The decision-making process is notably impacted by consumers’ perceptions of the ethical practices of companies engaging in personalization. The use of synchronized advertising and personalized strategies can amplify concerns about surveillance and privacy. A careful consideration of privacy involves weighing the potential costs against the benefits, leading to rational decision-making. This calculus is at the heart of the privacy paradox, highlighting the dissonance between widespread privacy concerns and the lack of protective actions taken by individuals (Norberg et al. 2007; Taddicken 2014; Kokolakis 2017). Liang et al. (2006) shed light on the positive reception of personalized services, attributing it to user satisfaction driven by several key theories: information overload, use and satisfaction, and user involvement. Personalized services can mitigate information overload, enhancing user satisfaction by aligning content and products with individual preferences. This alignment not only alleviates customer fatigue and shortens decision-making time but also lessens cognitive load (Chandra et al. 2022). The effectiveness of personalization is most pronounced among users with specific objectives or seeking particular knowledge, rather than those browsing for general information.
The exploration of information overload traverses diverse research paths, with studies variably focusing on psychological aspects and impacts on user behavior (Liang et al. 2006). Ho and Tang (2001) identify three key contributors to information overload: quantity, quality, and format of information. To mitigate overload, various strategies are suggested, such as employing knowledge maps for website navigation (Chung et al. 2005), adopting infomediary models (Ho and Tang 2001), and utilizing a Content-Based Recommendation Optimization Algorithm (CROA) for personalized knowledge services (Sun et al. 2019). Berghel et al. (1999) propose five strategies for managing information overload: search engines, information agencies, customization, brand identification, and information push strategies. The motivation behind using personalized services also plays a crucial role in how these services are perceived and accepted. The theory of use and satisfaction indicates that user satisfaction with personalized services varies significantly based on the underlying motivation: social interaction motives tend to yield higher satisfaction compared to motives like escape, fun, or leisure. Aguirre et al. (2016) discuss the personalization paradox, which emerges from the transparency of data collection efforts by businesses. Open data collection practices may lead to greater acceptance of personalized advertisements, in contrast to more covert data collection methods. Developing marketing strategies that build trust can either enhance user receptivity to personalized services or, conversely, heighten feelings of vulnerability and reduce the adoption of such services.
The utilization of personalized persuasive technologies represents a potent strategy within personalized service offerings. Kaptein et al. (2015) delve into the dichotomy of implicit and explicit personalization via predictive profiles, positing that persuasive systems might outperform human persuasiveness due to computers’ greater persistence (Alslaity et al. 2023). This notion is particularly explored within education and nutrition sectors, where the efficacy of computer-tailored health interventions has been under scrutiny. Studies consistently demonstrate that these computerized interventions significantly outpace traditional methods in promoting healthy eating behaviors (Parekh et al. 2012). Boerman et al. (2017) shed light on online behavioral advertising (OBA), also known as online profiling or behavioral targeting. OBA, a method of personalizing advertisements based on user online activities, aims to enhance the personal relevance of ads. This approach, however, often involves covert tracking and information collection, leading to consumer concerns over ethicality and harm (Stanton et al. 2017; Behera and Bala 2023). The implications of OBA for personal data protection continue to attract the attention of policymakers globally (Aguirre et al. 2016). Chen and Stallaert (2014) affirm OBA’s superior effectiveness over non-targeted advertising. Aguirre et al. (2016) highlight OBA’s success particularly among younger individuals who are experienced online, exhibit lower privacy concerns, and have specific preferences. The use of behavioral targeting is prevalent across devices, blurring the lines between online and offline experiences.
These findings demonstrate the necessity for a comprehensive examination of the impacts of various factors on perceptions of personalized services. Such an analysis could reveal novel insights and inform the development of analytical tools essential for advancing media literacy systems.
The study is structured to systematically address the examination of public understanding and perception of personalized content within the digital media landscape. Part 3 outlines the formal structure of the task alongside the development of an information-analytical system designed to assess population knowledge levels regarding personalized content. This system is compartmentalized into three sequential stages: initially, an information model is introduced to evaluate the public’s comprehension of personalized content; subsequently, a fuzzy methodology is applied to ascertain the collective knowledge level concerning personalized content within the digital media ecosystem; and finally, this approach is utilized to gauge the understanding of personalized content across various social strata. Part 4 details the application and validation of the information-analytical system using empirical data gathered from the Czech Republic’s populace. Part 5 then transitions to a discussion of the research findings, presenting both scientific insights and practical implications derived from the study. Concluding the paper, Part 6 not only recapitulates key outcomes but also proposes directions for future inquiry into personalized content, signaling avenues for advancing media literacy and enhancing the comprehension of personalized digital media services.
Materials and Methods
The information-analytical system for evaluating the level of knowledge among different social classes regarding personalized content within the digital media ecosystem undergoes verification and testing using actual data from the Czech Republic’s populace (Data from 1213 participants, 2024).
A total of 1975 respondents were contacted, with 1346 completing the online questionnaire between 20 February 2023 and 27 February 2023 using the CAWI technique, resulting in a 68.2% return rate. After evaluating response quality, 16 respondents were excluded, and 117 did not complete the questionnaire. Ultimately, data from 1213 respondents were collected, providing insights into their understanding of personal autonomy and the right to informational self-determination in the context of AI-driven personalized media content. The sample was selected using quota sampling from the adMeter respondent panel, targeting the Czech population aged 15 and over, with quotas based on sex, age, education, and region. The questionnaire design was informed by Volek et al. (2023) and expert consultations. On average, respondents spent 17 min and 51 s completing the questionnaire.
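The sample figures reported above can be cross-checked with a few lines of arithmetic. The short Python sketch below only recomputes the return rate and the final sample size from the numbers given in the text; the variable names are illustrative and not part of the study.

```python
# Figures as reported in the text.
contacted = 1975            # panelists contacted
responded = 1346            # questionnaires returned
excluded_for_quality = 16   # removed after the response-quality check
incomplete = 117            # questionnaires not completed

# Share of contacted panelists who returned the questionnaire.
return_rate = responded / contacted

# Respondents remaining after removing low-quality and incomplete responses.
final_sample = responded - excluded_for_quality - incomplete

print(f"return rate: {return_rate:.1%}")
print(f"final sample: {final_sample}")
```

Both values agree with the reported 68.2% return rate and the final sample of 1213 respondents.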
Participants engaged in completing a research questionnaire comprising 19 comprehensive questions. The primary objective of gathering this data was to glean insights into the population’s educational level on personalization, their preferences towards it, how they perceive its impact on their rights, their trust level in personalization, and under what conditions they would be inclined to trust it further. Additionally, we aimed to understand the control mechanisms they advocated and their views on related issues, including the monitoring of their online activities.
The statistical data we obtained adhere to all the criteria for sample formation, as evidenced by the representativeness of the selected respondents. They embody a comprehensive group of subjects under study and align with diverse demographic characteristics. Out of 1213 respondents, 604 are men and 609 are women. A significant 81.8% of these participants are of working age, with 65% possessing education and 25.7% holding higher education degrees. Predominantly, our sample comprises working-age individuals actively engaging with the digital media ecosystem, evaluating, and interacting with personalized content.
Formal formulation of the evaluation problem
Consider a region \(R\) and a population \(C=\{c_1; c_2; \ldots; c_n\}\) of this region. The population has a certain social profile defined by a set of descriptions of demographic characteristics \(S=\{s_1; s_2; \ldots; s_m\}\). Different combinations of demographic characteristics form different social classes \(ST=\{ST_1; ST_2; \ldots; ST_k\}\). A social class is a group of people in society who share a similar socioeconomic status, defined by factors such as income level, education, occupation, lifestyle, and access to resources. It is necessary to assess the level of knowledge of the population of different social classes regarding personalized content in the digital media ecosystem. For this purpose, the population is given a questionnaire with a set of text questions \(K\) (evaluation criteria), grouped into \(G\) groups, to determine the level of knowledge about personalized content. The received answers, together with the data processing system, represent an information model for evaluating the population’s knowledge of personalized content, \(M_{kpc}\). The input data processed according to the information model are then passed to a fuzzy method for determining the population’s aggregated level of knowledge about personalized content in the digital media ecosystem, \(FM_{kpc}\). Finally, the estimates of the population’s aggregated knowledge levels are passed to the fuzzy method for deriving the level of knowledge regarding personalized content for different social classes of the population, \(SM_{kpc}\).
For the formal formulation of the research task, the information-analytical system for assessing the level of knowledge of the population of different social classes regarding personalized content in the digital media ecosystem is presented in the form of an operator:
\[\mathrm{X}\left(R, C, ST, M_{kpc}, FM_{kpc}, SM_{kpc}\right) \to \Upsilon,\]
where \(\mathrm{X}\) is an operator that, based on the input data \(R, C, ST, M_{kpc}, FM_{kpc}, SM_{kpc}\), returns the set of output values \(\Upsilon\). The output comprises an assessment of the aggregate level of knowledge of the population about personalized content in the digital media ecosystem and \(KPC\), the linguistic level of knowledge of the population about personalized content in the digital media ecosystem for different social classes.
Based on the obtained raw data, the level of knowledge about personalized content in the digital media ecosystem for different social classes is determined. The knowledge obtained in this study will be useful for various management subjects in the digital media ecosystem to make management decisions.
The information-analytical system relies on expert knowledge, leading to the introduction of specific management roles for this study: respondents – citizens who shared their knowledge about content personalization; a system analyst – responsible for configuring all processes within the information-analytical system; and a decision-maker (DM) – a management entity that uses the gathered knowledge to make informed decisions. Next, a structural diagram of the information-analytical system for assessing the level of knowledge of the population of different social classes regarding personalized content in the digital media ecosystem is presented (Fig. 1).

Fig. 1 Structural diagram of the information-analytical system.
Figure 1 shows the three-stage structural diagram of the information-analytical system for assessing the knowledge level of the population of different social classes regarding personalized content in the digital media ecosystem. In the first stage, the data collected from the population according to the indicators for assessing the level of knowledge about content personalization are fed into the system. The received data are processed using the information model \(M_{kpc}\), after which they form the research database. In the second stage, the data are processed by the fuzzy method \(FM_{kpc}\), which determines the population’s aggregated level of knowledge regarding personalized content in the digital media ecosystem and forms the research knowledge base. In the third stage, the demographic characteristics of the population are considered. Here, the DM defines the social classes \(ST\) to be studied. The data on the defined social class and the aggregated knowledge levels of the population then enter the fuzzy method \(SM_{kpc}\) for processing. This yields a normalized evaluation and the linguistic level of the population’s knowledge of personalized content for the social class defined by the DM. The DM analyzes the resulting ratings and either makes a decision or revises the rating by changing the settings or the social class.
Information-analytical system for assessment of the level of knowledge of the population of different social classes about personalized content in the digital media ecosystem
The information-analytical system is presented in the form of three declared stages.
The first stage of the information-analytical system
An information model for assessing the population’s knowledge of personalized content, \(M_{kpc}\), is given. The assessment of the population’s knowledge of personalized content in the digital media ecosystem is based on criteria divided into three groups.
\(G_1\) – Awareness of how personalization of content occurs on the Web, including how it may differ depending on the type of content.
The criteria of the first group are statements about awareness of content personalization on the Internet. Respondents rate each statement with a score from the interval [1; 10], where 1 point means “completely disagree” with the statement and scores approaching 10 mean “completely agree”. The rating scale characterizes the level of awareness of content personalization on the Internet. An open-ended set of statements is provided below.
\(K_{11}\) – On the Internet, websites display the same content to all users.
\(K_{12}\) – When I search using available search engines (Google, Seznam, etc.), I see the same content as everyone else searching for the same keyword or phrase at the same time.
\(K_{13}\) – When I go to the web pages of the servers of government institutions (ministries, authorities), I see the same content as all other users of these pages.
\(K_{14}\) – When I go to the web pages of political party servers, I see the same content as other users of those sites.
\(K_{15}\) – When I go to news server websites (novinky.cz, idnes.cz, etc.), I see the same content as all other users of those pages.
\(K_{16}\) – When navigating web pages on media servers (Netflix, streaming, etc.), I see the same content as everyone else on those sites.
\(K_{17}\) – When I go to lifestyle server websites (knitting, fashion, cars, sports, etc.), I see the same content as everyone else on those sites.
\(K_{18}\) – When I go to the websites of sales servers (online stores), I see the same content as all other users of those sites.
\(K_{19}\) – When I go to social networking websites (Facebook, LinkedIn, etc.), I see the same content as everyone else on those sites.
\(K_{110}\) – When I go to a person’s personal website (e.g., an artist, scientist, journalist, etc.), I see the same content as everyone else on that website.
\(K_{111}\) – When I go to professional websites for small and medium businesses (like a car service website, etc.), I see the same content as everyone else on that website.
\(K_{112}\) – When I go to professional websites of large companies (such as an international company’s website, etc.), I see the same content as everyone else on the site.
After the survey, point estimates \(O_{11}, O_{12}, \ldots, O_{112}\) are obtained for the criteria \(K_{11}, K_{12}, \ldots, K_{112}\).
First, for the first group of criteria, one general assessment of citizens’ awareness of content personalization on the Internet, depending on its type, should be obtained:
\[\lambda_1(c_i)=\sum_{j=1}^{l} O_{1j}(c_i),\]
where \(l\) is the number of criteria in group \(G_1\) (for the proposed set of criteria, \(l=12\)) and \(i\) is the index of the respondent, \(i=\overline{1,n}\).
Fuzzification of the input data is proposed using intelligent knowledge analysis in the form of a harmonic Z-spline membership function. For the proposed number of criteria \(l=12\), the membership function will have the form:
where \(\lambda_1(c_i)\) is the total number of points obtained for the first group of criteria and \(i\) is the index of the respondent, \(i=\overline{1,n}\).
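As a rough illustration of the scoring and fuzzification described for group \(G_1\), the following Python sketch sums the twelve scores and passes the total through a harmonic Z-spline. The cosine form of the spline and the breakpoints a = l and b = 10l (the smallest and largest possible totals) are assumptions made for this sketch, as are the function names and the sample scores. A decreasing function fits this group because every statement claims that all users see the same content, so stronger agreement presumably signals lower awareness of personalization.

```python
import math

def group1_total(scores):
    """Sum the 1-10 agreement scores for the G1 statements (the total lambda_1)."""
    if any(not 1 <= s <= 10 for s in scores):
        raise ValueError("each score must lie in [1, 10]")
    return sum(scores)

def harmonic_z_spline(x, a, b):
    """Decreasing membership: 1 at or below a, 0 at or above b, cosine in between."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return 0.5 + 0.5 * math.cos(math.pi * (x - a) / (b - a))

L_CRITERIA = 12
# Assumed breakpoints: the minimum and maximum attainable totals for l criteria.
A, B = 1 * L_CRITERIA, 10 * L_CRITERIA

# Hypothetical answers of one respondent to K11..K112.
scores = [3, 2, 8, 6, 4, 1, 3, 2, 1, 7, 6, 5]
total = group1_total(scores)
mu1 = harmonic_z_spline(total, A, B)
print(total, round(mu1, 3))
```

Under these assumptions, a respondent who mostly disagrees with the "same content for everyone" statements obtains a membership value close to 1, and a respondent who mostly agrees obtains a value close to 0.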
\(G_2\) – Awareness of the technical way of personalizing content on the Internet.
For this group of criteria, a multiple-choice question is offered:
How do you think content is adapted (or personalized) online?
\(K_{21}\) – An online service/website puts me in a group of people with similar interests who are then offered the same content.
\(K_{22}\) – An online service/website tracks my online behavior, creates a personal profile of my personality, and then offers me tailored content based on that.
\(K_{23}\) – Based on various classifiers, the online service/website predicts how likely I am to take a certain action (e.g., click a link) and offers me content accordingly.
\(K_{24}\) – The online service/website combines all the above procedures.
\(K_{25}\) – Content setting works differently than stated.
Each respondent may give from one to five answers. All answers reflect awareness of the technical way of personalizing content on the Internet.
In this case, fuzzification of the input data is carried out based on the number of received responses, using the following characteristic function:
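Since the fuzzification for this group depends only on how many options a respondent selects, it can be sketched as a lookup from the answer count to a value in [0; 1]. The linear table used below (count divided by five) is purely a placeholder assumption, not the characteristic function of the study; the function name and option labels are likewise hypothetical, and a different table can be passed in.

```python
G2_OPTIONS = ("K21", "K22", "K23", "K24", "K25")

def group2_membership(selected, table=None):
    """Map the number of selected G2 options to a value in [0, 1]."""
    chosen = set(selected)
    unknown = chosen - set(G2_OPTIONS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    if not chosen:
        raise ValueError("at least one option must be selected")
    count = len(chosen)
    if table is None:
        # Placeholder assumption: membership grows linearly with the answer count.
        table = {k: k / len(G2_OPTIONS) for k in range(1, len(G2_OPTIONS) + 1)}
    return table[count]

# Hypothetical respondent who selected two of the five options.
mu2 = group2_membership(["K21", "K23"])
print(mu2)
```

Passing the mapping as a parameter keeps the sketch usable with whatever count-to-value assignment the analyst actually adopts.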
\(G_3\) – Awareness of the level of control over online content.
This group considers statements intended to determine how much control respondents feel they have over the content they see online. Respondents choose the one statement that best describes their view, which carries the score shown. If none of the suggested answers matches exactly, respondents adjust the assigned score so that the value is as close as possible to their actual view. The scale of these ratings is heuristic, but it characterizes the level of control over online content.
\(K_{3}\) – How much control do you feel you have over what content you see online?
- I don’t know (0 points);
- the provider has no control over how online content is displayed to me; all control is in my hands (10 points);
- I have more control over how online content is displayed to me than the content provider (15 points);
- I have the same control over how online content is shown to me as the content provider (20 points);
- I have little control over how online content is displayed to me; the content provider has more control (25 points);
- I have no control over how online content is displayed to me; this is completely in the hands of the content provider (30 points).
After the evaluation, a score is obtained, which is denoted as \(O_{31}\). For its fuzzification, it is proposed to use intelligent knowledge analysis and an S-shaped membership function:
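To make the S-shaped fuzzification of the control score concrete, the sketch below applies a harmonic (cosine) S-spline to the point values attached to the answer options. The cosine form and the breakpoints 0 and 30 (the ends of the point scale) are assumptions chosen for illustration; the function and variable names are hypothetical.

```python
import math

# Point values attached to the K3 answer options in the text.
K3_SCORES = (0, 10, 15, 20, 25, 30)

def harmonic_s_spline(x, a, b):
    """Increasing membership: 0 at or below a, 1 at or above b, cosine in between."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return 0.5 + 0.5 * math.cos(math.pi * (x - b) / (b - a))

# Assumed breakpoints: the lowest and highest scores on the heuristic scale.
A3, B3 = 0, 30

mu3_by_score = {o: round(harmonic_s_spline(o, A3, B3), 3) for o in K3_SCORES}
print(mu3_by_score)
```

Because respondents may adjust the assigned score, the function accepts any value between the breakpoints, not only the six listed point values.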
Thus, at the first stage, a set of information indicators was provided for the assessment of citizens’ knowledge of personalized content and approaches to their standardization. It is noted that this set is open, and further calculations do not depend on their number.
The second stage of the information-analytical system
The second and third stages of the information-analytical system are presented in the form of a step-by-step algorithm.
At the second stage of the model, the processed, normalized input data enter the fuzzy method for determining the aggregated level of knowledge of citizens regarding personalized content in the digital media ecosystem.
1st step. Entering weighting coefficients for each criterion group
Let the DM set the weighting coefficients for each group of criteria \(\{\alpha_{1},\alpha_{2},\alpha_{3}\}\) from the interval [1; 10]. If there is no need to distinguish the importance of the groups of criteria, the weights are taken to be equal. Since fuzzy set theory operates on the interval [0; 1], the weighting coefficients must also be normalized:
\[\beta_{k}=\frac{\alpha_{k}}{\alpha_{1}+\alpha_{2}+\alpha_{3}},\quad k=\overline{1,3}. \qquad (6)\]
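The normalization step can be sketched in Python as follows. This is a minimal illustration, assuming that each weight is divided by the sum of all weights, which reproduces the normalized values reported in the Results for the weights 9, 10 and 8; the helper name `normalize_weights` is hypothetical.

```python
def normalize_weights(alphas):
    """Scale raw importance weights so that they sum to 1."""
    total = sum(alphas)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [alpha / total for alpha in alphas]


betas = normalize_weights([9, 10, 8])
print([round(beta, 2) for beta in betas])  # prints [0.33, 0.37, 0.3]
```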
2nd step. Deriving the aggregate level of citizens’ knowledge about personalized content in the digital media ecosystem
For this, one of the convolutions is applied separately for each citizen \(c_{i}\) \((i=\overline{1,n})\).
Thus, aggregated assessments of citizens’ knowledge about personalized content in the digital media ecosystem are obtained, covering awareness of content personalization processes on the Internet, awareness of the technical means of personalization, and awareness of the level of control over online content. The DM selects the type of convolution, and the resulting aggregated estimate satisfies \(y_{ST}(c_{i})\in[0;1]\). The closer \(y_{ST}(c_{i})\) is to 1, the higher the level of knowledge about personalized content in the digital media ecosystem.
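To make the aggregation concrete, here is a minimal sketch of one possible convolution, a weighted arithmetic mean of the fuzzified group scores. Both the exact form of the convolution and the simplification to a single membership value per criterion group are assumptions made for illustration; the function name `weighted_average_convolution` is hypothetical.

```python
def weighted_average_convolution(memberships, betas):
    """Aggregate per-group membership values with normalized weights.

    memberships: fuzzified scores in [0, 1], one per criterion group (assumed).
    betas: normalized weights that sum to 1.
    """
    if len(memberships) != len(betas):
        raise ValueError("one weight is required per membership value")
    if abs(sum(betas) - 1.0) > 1e-9:
        raise ValueError("weights must be normalized to sum to 1")
    return sum(beta * mu for beta, mu in zip(betas, memberships))


# Illustrative membership values (not taken from the study's data).
betas = [9 / 27, 10 / 27, 8 / 27]
y_st = weighted_average_convolution([0.5, 1.0, 0.0], betas)
print(round(y_st, 3))
```

Because the weights sum to 1 and every membership value lies in [0; 1], the aggregated estimate stays in [0; 1] as well.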
The third stage of the information-analytical system
At the final stage, a fuzzy method of deriving the level of knowledge regarding personalized content for different social classes of citizens is presented – \(SM_{kpc}\).
3rd step. Considering demographic characteristics of citizens
First, for each citizen, it is necessary to consider the demographic characteristics of their social profile \(S=\{s_{1};s_{2};\ldots;s_{m}\}\), with formalized answers. The three most common ones are offered.
1. Gender – \(\sigma_{1}\). Here, the linguistic evaluation is as follows: \(s_{1}=\) “man”; \(s_{2}=\) “woman”.
2. Age – \(\sigma_{2}\). For this demographic characteristic, unified age intervals (in years) are used:
\(s_{3}\) = “If from 15 to 24”.
\(s_{4}\) = “If from 25 to 34”.
\(s_{5}\) = “If from 35 to 44”.
\(s_{6}\) = “If from 45 to 54”.
\(s_{7}\) = “If from 55 to 64”.
\(s_{8}\) = “If from 65 to 74”.
\(s_{9}\) = “If 75 or more”.
3. Education – \(\sigma_{3}\). For this demographic characteristic, the highest completed education is considered:
\(s_{10}\) = “Unfinished basic education”.
\(s_{11}\) = “Basic education”.
\(s_{12}\) = “Secondary general education (without a diploma of secondary education)”.
\(s_{13}\) = “Full secondary general (with high school diploma)”.
\(s_{14}\) = “Professional education”.
\(s_{15}\) = “Higher education (bachelor, master)”.
4th step. Deriving citizens’ knowledge about personalized content for different demographic characteristics
For this, the arithmetic mean of the aggregated estimates of citizens’ knowledge regarding personalized content is calculated for each demographic characteristic:
\[\psi_{ST}(s_{j})=\frac{1}{r_{j}}\sum_{c_{i}\in s_{j}}y_{ST}(c_{i}), \qquad (11)\]
where \(s_{j}\) is a demographic characteristic, \(j=\overline{1,m}\), and \(r_{j}\) is the number of citizens belonging to the demographic characteristic \(s_{j}\).
Therefore, the values \(\psi_{ST}(s_{j})\) characterize the aggregated level of knowledge regarding personalized content of the citizens with the corresponding demographic characteristic \(s_{j}\).
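The per-characteristic averaging can be sketched as below. The three scores are the example values quoted in the Results for citizens c2, c3 and c11; the demographic profiles assigned to them, and the helper name `mean_by_characteristic`, are hypothetical and serve only to show the mechanics.

```python
from collections import defaultdict


def mean_by_characteristic(scores, profiles):
    """Average aggregated scores over the citizens sharing each characteristic.

    scores: citizen id -> aggregated estimate y_ST in [0, 1].
    profiles: citizen id -> demographic labels of that citizen (e.g. "s1").
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for citizen, y_st in scores.items():
        for label in profiles[citizen]:
            sums[label] += y_st
            counts[label] += 1
    # Characteristics without respondents simply do not appear in the result.
    return {label: sums[label] / counts[label] for label in sums}


scores = {"c2": 0.723, "c3": 0.51, "c11": 0.858}
profiles = {  # hypothetical profiles, for illustration only
    "c2": ("s1", "s5", "s15"),
    "c3": ("s2", "s5", "s14"),
    "c11": ("s1", "s3", "s15"),
}
psi = mean_by_characteristic(scores, profiles)
print({label: round(value, 4) for label, value in sorted(psi.items())})
```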
5th step. Determining the level of knowledge of citizens regarding personalized content in the digital media ecosystem for different social classes
Such social classes are formed by grouping the demographic characteristics \(S\) according to some rule \(T\). The DM defines this rule for including various demographic characteristics. For example, a combination of three demographic characteristics for gender, age, and education is suggested. Thus, one value is obtained from each of the following sets: \(\sigma_{1}^{\ast}=\{s_{1};\,s_{2}\};\ \sigma_{2}^{\ast}=\{s_{3};\,s_{4};\ldots;s_{9}\};\ \sigma_{3}^{\ast}=\{s_{10};\,s_{11};\ldots;s_{15}\}\).
Next, intelligent knowledge analysis based on multidimensional membership functions is used. Uncertainties of the “average value” type are modeled in three-dimensional space, and it is proposed to use a cone-shaped or pyramidal membership function. Moreover, the value of the center of the base will be a unit vector, and the scaling will be based on the coordinates \(\left(3;3;3\right)\).
The conical membership function is described by the formula:
\[\Delta=\frac{1}{3}\sqrt{\sum_{k=1}^{3}{\left(\psi_{ST}(\sigma_{k}^{\ast})-1\right)}^{2}},\qquad C_{KPC}=1-\Delta. \qquad (12)\]
The choice of the type of membership function is up to the system analyst. Of course, sometimes this leads to slight ambiguities in the result, but if taken as a whole, it does not affect the results’ reliability.
6th step. Derivation of the linguistic level of knowledge
In conclusion, \(KPC\) is derived. For this purpose, the obtained \(C_{KPC}\) estimates are mapped to the term set of the variable \(KPC=\{kpc_{1},kpc_{2},\ldots,kpc_{5}\}\), which describes the knowledge of citizens regarding personalized content in the digital media ecosystem, as follows: \(C_{KPC}\in(0.89;\,1]\) – \(kpc_{1}\) = “high level”; \(C_{KPC}\in(0.77;\,0.89]\) – \(kpc_{2}\) = “the level is above average”; \(C_{KPC}\in(0.65;\,0.77]\) – \(kpc_{3}\) = “average level”; \(C_{KPC}\in(0.54;\,0.65]\) – \(kpc_{4}\) = “low level”; \(C_{KPC}\in[0;\,0.54]\) – \(kpc_{5}\) = “very low level”.
The boundaries between the levels are set by the systems analyst on the basis of real research data. Our research used a database of 1,213 respondents’ answers to questions contained in the information model of evaluation – \(M_{kpc}\).
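The mapping from a numeric estimate to a linguistic term can be written as a simple threshold lookup. The sketch below uses the interval boundaries listed above; the function name `linguistic_level` is an illustrative assumption.

```python
def linguistic_level(c_kpc: float) -> str:
    """Map a C_KPC estimate in [0, 1] to its linguistic term.

    Every interval is open on the left and closed on the right,
    except the lowest one, which also includes 0.
    """
    if not 0.0 <= c_kpc <= 1.0:
        raise ValueError("C_KPC must lie in [0, 1]")
    if c_kpc > 0.89:
        return "high level"
    if c_kpc > 0.77:
        return "the level is above average"
    if c_kpc > 0.65:
        return "average level"
    if c_kpc > 0.54:
        return "low level"
    return "very low level"


print(linguistic_level(0.812))  # prints: the level is above average
```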
Results
The study was conducted using the developed information analysis system on a complete set of data with the aim of determining the level of knowledge of the population regarding personalized content in the digital media ecosystem.
In order for other researchers or managers to be able to reproduce this study, a sample evaluation of data fragments from 124 respondents for the Jihomoravský region (data from 1213 participants 2024) is provided.
In the first stage, based on the \(M_{kpc}\) information model, the population’s knowledge of personalized content is evaluated. Fragments of input data received from the population are shown in Table 1.
After that, the fuzzification procedures are performed using formulas (3)-(5), and the results are shown in Table 2.
The calculation of the second and third stages is presented in the form of a step-by-step algorithm.
1st step. Entering weighting coefficients for each criterion group
First, the DM sets the weighting coefficients for each group of criteria: \(\alpha_{1}=9\), \(\alpha_{2}=10\), \(\alpha_{3}=8\). The normalized weight coefficients according to formula (6) are \(\beta_{1}=0.33\), \(\beta_{2}=0.37\), \(\beta_{3}=0.3\).
2nd step. Deriving the aggregate level of citizens’ knowledge about personalized content in the digital media ecosystem
For example, using the average convolution according to formula (9): \(y_{ST3}(c_{2})=0.723\); \(y_{ST3}(c_{3})=0.51\); \(y_{ST3}(c_{11})=0.858\); …; \(y_{ST3}(c_{1209})=0.654\).
Therefore, aggregated estimates of the population’s knowledge of personalized content in the digital media ecosystem in the Jihomoravský region are obtained.
3rd step. Considering demographic characteristics of citizens
First, for each inhabitant, the demographic characteristics of the social profiles \(S_{1},S_{2},S_{3}\) are considered. Fragments of data on the demographic characteristics of the population are shown in Table 3.
4th step. Deriving citizens’ knowledge about personalized content for different demographic characteristics
Next, the arithmetic mean of the obtained aggregated estimates of the population’s knowledge of personalized content in the Jihomoravský region is calculated for each demographic characteristic according to formula (11): \(\psi_{ST}(s_{1})=0.649\); \(\psi_{ST}(s_{2})=0.646\); \(\psi_{ST}(s_{3})=0.735\); \(\psi_{ST}(s_{4})=0.639\); \(\psi_{ST}(s_{5})=0.629\); \(\psi_{ST}(s_{6})=0.634\); \(\psi_{ST}(s_{7})=0.681\); \(\psi_{ST}(s_{8})=0.638\); \(\psi_{ST}(s_{9})=0.68\); \(\psi_{ST}(s_{11})=0.674\); \(\psi_{ST}(s_{12})=0.62\); \(\psi_{ST}(s_{13})=0.62\); \(\psi_{ST}(s_{14})=0.121\); \(\psi_{ST}(s_{15})=0.697\). Note that the demographic characteristic \(s_{10}\) has no respondents in the studied region.
5th step. Determining the level of knowledge of citizens regarding personalized content in the digital media ecosystem for different social classes
Next, a calculation is made to determine the population’s level of knowledge about personalized content in the digital media ecosystem for different social classes. For example, the DM defines the following social class: males with higher education aged between 35 and 44 years. Thus, a value is obtained from each set: \(\sigma_{1}^{\ast}=\left\{s_{1}\right\}\); \(\sigma_{2}^{\ast}=\left\{s_{5}\right\}\); \(\sigma_{3}^{\ast}=\left\{s_{15}\right\}\).
To derive the level of knowledge of the population regarding personalized content for the above social class, intelligent knowledge analysis and a conical multidimensional membership function are used, according to formula (12): \(\Delta=\frac{1}{3}\cdot\sqrt{{\left(0.649-1\right)}^{2}+{\left(0.629-1\right)}^{2}+{\left(0.697-1\right)}^{2}}=0.188\); \(C_{KPC}=0.812\).
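The conical computation for this social class can be checked with a few lines of Python. The sketch assumes the form visible in the worked expression, namely a Euclidean distance to the unit vector scaled by 3, with the score taken as one minus that distance; the function name `conical_membership` is hypothetical. Evaluating the expression with the three rounded means printed above gives a distance of about 0.198 and a score of about 0.80, slightly below the reported 0.812; both values fall in the same interval (0.77; 0.89].

```python
import math


def conical_membership(psi_values, scale=3.0, center=1.0):
    """Conical membership: 1 minus the scaled distance to the cone's center.

    psi_values: mean knowledge estimates for the selected characteristics.
    scale and center follow the assumed form Delta = sqrt(sum((psi - 1)^2)) / 3.
    """
    if not psi_values:
        raise ValueError("at least one value is required")
    distance = math.sqrt(sum((psi - center) ** 2 for psi in psi_values)) / scale
    return max(0.0, 1.0 - distance)


# Means for s1 (men), s5 (aged 35-44) and s15 (higher education).
c_kpc = conical_membership([0.649, 0.629, 0.697])
print(round(c_kpc, 3))
```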
6th step. Derivation of the linguistic level of knowledge
In conclusion, \(KPC\) is derived: the linguistic level of knowledge of the population regarding personalized content in the digital media ecosystem for the social class of men with higher education aged 35 to 44 years. For this, the obtained \(C_{KPC}\) score is compared with the term set of the variable \(KPC\), and the following is obtained: \(C_{KPC}\in(0.77;\,0.89]\) – \(kpc_{2}\) = “the level is above average”.
Similarly, the level of population knowledge about personalized content in the digital media ecosystem is calculated for all possible social classes. The calculations are shown separately in Table 4 for \(s_{1}\) = “men” and in Table 5 for \(s_{2}\) = “women”.
Analyzing the data obtained, we can see that the level of knowledge of the population regarding personalized content in the digital media ecosystem is at an average to above-average level. The lowest scores are given by men and women aged 35–44 and with professional education. Men and women aged 15 to 24 and with higher education have the highest scores. This indicates that young people with higher education exhibit a higher level of awareness in three key areas: understanding content personalization processes on the Internet, the technical methods used for personalization, and the degree of control over online content.
Discussion
The digital transformation and rapid growth of social media have ushered in a dual-edged reality of opportunities and risks, highlighting the importance of understanding information processes and acquiring media consumption skills. This necessitates the development of media literacy frameworks, a crucial component of which is algorithmic literacy (that is, the ability to understand, create, and effectively use algorithms), which encompasses understanding the algorithmic mechanisms that influence user decisions regarding personalized services. The impact of daily experiences on algorithmic literacy varies across different demographic groups (Swart 2021; Du 2023). Zarouali et al. (2021) found that misconceptions about algorithms are predominantly observed among the elderly, less educated individuals, and women. Assessing the public’s understanding of personalized services can pinpoint vulnerable groups needing priority in media literacy educational programs. Interestingly, younger individuals and those heavily reliant on mobile technology are generally more open to personalization techniques, showing less concern over data misuse, a trend also noted by Segijn and Van Ooijen (2020).
The necessity for media literacy programs is highlighted across multiple aspects. Beyond offering a safeguard, these programs are crucial in equipping users with strategies to combat information overload, optimize the utilization of data and information sources, achieve greater satisfaction, and discern sources of disinformation, including the potential for their spread.
Media literacy programs play a crucial role in addressing the personalization paradox, wherein increased personalization enhances the relevance and acceptance of services among consumers, yet may also heighten feelings of vulnerability and lower acceptance levels (Aguirre et al. 2016). The adoption of algorithmic systems varies across countries, organization types (public service media vs. private media), economic resource availability, and organizational culture, which hinders the establishment of unified methods for adopting personalized services (Mitova et al. 2023; Walters 2022; Kozyreva et al. 2021; Caplan and Boyd 2018). The nature of personalized services significantly affects their acceptance. Sehl and Eder (2023) found that while personalization in political advertising and news sources was largely rejected in Germany and the UK, personalized commercial advertising and entertainment recommendations were more favorably received. Collecting data on interests, location history, and religious or political views is commonly viewed unfavorably. Personalized services risk amplifying social polarization, limiting awareness, and cementing pre-existing opinions (Perra and Rocha 2019), thereby supporting an environment conducive to disinformation and the emergence of damaging social structures (Brkan 2019).
Developing impactful media literacy models that consider socio-demographic and economic aspects of service personalization, population digitalization levels, attitudes towards personal data protection cynicism (Hoffmann et al. 2024), perspectives on algorithmic personalization tools, and related factors can bolster various analytical frameworks utilizing fuzzy methodologies, as supported by research findings.
Our research endeavor led to the creation of an innovative information-analytical system, meticulously designed to gauge the understanding of different social strata regarding personalized content within the digital media ecosystem. This system is a culmination of several key components:
1. Development of an Information Model: We crafted an information model tailored for assessing the depth of population knowledge concerning personalized content. This model serves as the foundational layer, guiding the subsequent analysis phases.
2. Fuzzy Method for Aggregate Knowledge Assessment: Leveraging fuzzy logic, we formulated a method to aggregate the knowledge levels regarding personalized content across the digital media landscape.
3. Fuzzy Method for Social Class Knowledge Derivation: A specialized fuzzy method was also developed to extract insights into how different social classes comprehend personalized content. This facet of our system emphasizes the demographic and socio-economic disparities in digital media literacy.
4. System Verification and Configuration: To ensure the robustness and applicability of our system, it was tested and fine-tuned using real data gathered from 1,213 respondents across the Czech Republic. This extensive dataset provided a solid basis for our system’s calibration and validation.
5. Approbation Example: To illustrate the practical utility of our system, we conducted an approbation using a subset of data from 124 individuals in the Jihomoravský region. This exercise not only demonstrated the system’s operational capabilities but also highlighted its potential for region-specific media literacy interventions.
Through these concerted efforts, our work paves the way for a more informed understanding of personalized content consumption patterns across various demographics, setting the stage for targeted media literacy programs that address the unique needs of diverse population segments.
Our research is anchored in the sophisticated mathematical framework of fuzzy sets theory, knowledge-based intelligent analysis, expert evaluations, and multidimensional membership functions. This robust mathematical toolkit enables a sophisticated quantification and linguistic articulation of the populace’s understanding of personalized content, tailored to the social demographics of interest to Decision Makers (DM). Furthermore, the theory’s capacity to encapsulate the subjective insights of experts fortifies the decision-making process with a layer of informed intuition. By calibrating our information-analytical system with real respondent data, we have enhanced the precision of our findings, thereby delivering actionable insights of significant practical relevance to a broad spectrum of stakeholders, ranging from NGOs and public entities to governmental bodies.
The unique contribution of our information-analytical system is its comprehensive approach. It incorporates respondent feedback on their awareness of internet content personalization processes, their understanding of personalization techniques, and their perception of online content control levels. Moreover, it integrates the social demographics of the population into its assessments, employing intelligent knowledge analysis to evaluate the population’s grasp of personalized content within the digital media ecosystem. The outcome is a detailed portrait of knowledge levels about personalized content for specific social classes of interest to DMs. One of the system’s hallmarks is its openness to various groups of criteria and evaluation metrics, empowering other researchers to include additional assessment indicators and seamlessly adapt the system for use in other regions. This flexibility, combined with the system’s ability to aggregate individual responses into a collective understanding of personalized content in the digital media landscape, highlights its utility in bridging the gap between isolated opinions and communal knowledge across diverse regional contexts.
A limitation of our study pertains to the respondent sample for our research questionnaire, impacting the system’s verification process. The delineation of initial knowledge levels concerning personalized content amongst the population is contingent upon a system analyst’s interpretation of actual research data. Moreover, the potential ambiguity of outcomes is influenced by the selection of multidimensional membership function types and the formulation of characteristic functions. Despite these limitations, it is important to highlight that they do not compromise the reliability of our findings. The scientific results obtained and substantiated through our study robustly validate the research hypothesis, underscoring the efficacy and integrity of our investigative framework despite the outlined challenges.
The methodology presented ensures repeatability, allowing the same results to be obtained when experiments are replicated. This reliability is supported by the underlying mathematical theory and the experiments conducted. Additionally, the methodology is designed to be independent of the number of criteria used to evaluate citizens’ knowledge of personalized content and demographic characteristics, making it easily adaptable for researchers in different countries.
The research indicates that individuals with a strong interest in new technology, media, or marketing are often keen to learn more about content personalization to leverage this knowledge for their benefit. Professionals in media, marketing, or IT are particularly motivated to deepen their understanding of personalized content to stay competitive in their fields. Furthermore, awareness of the importance of personal data protection drives individuals to explore content personalization and control their online activities. Understanding the societal and political impact of content personalization can also inspire people to engage in discussions or activism on the topic.
Future studies will investigate the determinants influencing the adoption of personalized news content, aiming not only to decipher the factors behind varying acceptance levels of individual techniques by users but also to gauge their perspectives on ethical considerations surrounding data collection and the ethics of consumer agencies. This inquiry, taking into account socio-demographic and socio-political contexts, seeks to pinpoint groups potentially at risk, thereby informing protective actions and aiding in the development of tailored media literacy enhancement programs. As media consumption escalates, the refinement and proliferation of personalization techniques are expected to advance. Associating multiple devices’ IP addresses with a single user opens new avenues for applying diverse personalization strategies and tools (Kaaniche et al. 2020). A deepened comprehension and awareness of the nuances of various personalization techniques among the populace will serve as a crucial foundation for their broader acceptance. Additionally, we consider the methods of the information and analytical system as algorithms to be implemented in innovative software, enabling practical application of the research by all interested parties. This implementation will facilitate the assessment of differences in digital media literacy across various social classes.
Conclusions
First, we developed a novel information model to evaluate how populations comprehend personalized content. This model introduced criteria groups concerning awareness of the content personalization process on the Internet, comprehension of personalization techniques, and the level of control individuals felt they had over online content. Secondly, we introduced an innovative fuzzy method for aggregating the collective knowledge about personalized content in the digital media ecosystem. This approach enabled us to derive aggregated knowledge assessments for the population within the studied region, offering insights into the general understanding of personalized content. Thirdly, a unique fuzzy method was created to ascertain the level of knowledge about personalized content across various social classes. By considering demographic characteristics to shape the social profile of the population and utilizing intellectual analysis of knowledge with multi-dimensional membership functions, we could articulate both normalized assessments and linguistic descriptions of knowledge levels across different social classes.
The practicality and efficacy of our information-analytical system were verified through rigorous testing with real data gathered from 1,213 respondents in the Czech Republic. An approbation example of the calculation for the Jihomoravský region, involving data from 124 participants, experimentally validated the system’s efficiency and relevance, showcasing its applied research value.
Looking forward, we aim to understand the factors influencing the acceptance of personalized services and their mechanisms across varied demographic structures and social classes, focusing on individuals’ rights and trust in media service providers. A key research avenue will explore the relationship between the rate at which the population adopts personalized services and their perceptions of disinformation risks within the media ecosystem. The findings from this study promise to be instrumental for media literacy policymakers, public policy experts, regulators, and media institution managers and research teams. Our work demonstrates the critical need for systematically exploring the social class implications of personalization and algorithmic services, highlighting potential risks and the current gap in media and algorithmic literacy.