Content analysis

From HLWIKI Canada
Figure: Nine (9) major steps in the content analysis method (source: Wikicommons)


Last Update

  • This entry is out of date and will not be updated (August 2018)


See also Big data | Grounded theory | Research Portal for Academic Librarians | Text-mining

"...Content analysis is a research method used for making replicable and valid inferences from data to their context, with the purpose of providing knowledge, a representation of facts, new insights, and a practical guide to action." (Krippendorff, 1980)

Content analysis (also called textual analysis, and sometimes loosely equated with grounded theory) is a research method used to examine, and objectively quantify, the presence of certain words, concepts, themes, phrases, characters and sentences within a text or set of texts. Texts may be articles, books, book chapters, interviews, discussions, historical documents, speeches, conversations, e-mail, or any other occurrence of communicative language. Content analysis is commonly used to examine data generated by social media technologies such as blogs, wikis and Twitter. It enables researchers to sift through large volumes of data systematically and with relative ease. Krippendorff says that content analysis research is motivated by the search for techniques to infer from data what would be too costly or too obtrusive to obtain by other techniques.

To conduct a content analysis, texts are coded (broken down) into manageable categories at several levels (word, word sense, phrase, sentence or theme) and then examined using an established analytical method. Results are used to make inferences about the text(s), writer(s), audience and culture. Content analysis can reveal pertinent features such as comprehensiveness of coverage, bias, prejudice and oversight on the part of the author or of any other person responsible for the content. There are two approaches to coding. Emergent coding develops categories after a preliminary examination of the data. In a priori coding, categories are established before the analysis, based on a theory: professional colleagues agree on the categories, and the coding is applied to the data. Revisions are made as necessary, and categories are refined to maximize mutual exclusivity and completeness.
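As a rough illustration of a priori coding at the word level, the sketch below counts how often terms from each predefined category occur in a text. The codebook, its category names and the sample sentence are all invented for this example; a real study would derive its categories from theory and have colleagues agree on them first.

```python
import re
from collections import Counter

# Hypothetical a priori codebook: category -> indicator terms.
# Both the categories and the terms are invented for illustration.
CODEBOOK = {
    "access": {"access", "availability", "open"},
    "cost": {"cost", "price", "fee", "fees"},
}


def code_text(text, codebook=CODEBOOK):
    """Count how many word tokens in `text` fall into each category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # Start every category at zero so absent categories are still reported.
    counts = Counter({category: 0 for category in codebook})
    for token in tokens:
        for category, terms in codebook.items():
            if token in terms:
                counts[category] += 1
    return counts


sample = "Open access lowers the cost of reading, but publication fees remain."
print(code_text(sample))
```

Exact word matching is the simplest possible coding rule. It ignores word sense and context, which is why human coders and refinement of the categories remain part of the method.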

In 1931, Lindesmith used content analysis to refute an existing hypothesis. The method was frequently referred to as grounded theory until the 1960s. Its purpose was to examine the frequency of keywords in texts in order to determine the most important structures of the writing in question. Today, content analysis (CA) is used in many kinds of research to determine the most important aspects of texts. Establishing reliability is comparatively straightforward in content analysis, and among research methods CA is considered to score highest with respect to ease of replication.

In 2006, Robinson noted that content analysis is an alternative technique in library and information science (LIS) research that is too often ignored. She outlined the basic concepts of content analysis and explored possible reasons for its limited application in the LIS field.

Six questions addressed by content analysis

According to Krippendorff (1980 and 2004), six questions must be addressed in every content analysis:

  1. Which data are analyzed?
  2. How are the data defined?
  3. What is the population from which the data are drawn?
  4. What is the context relative to which the data are analyzed?
  5. What are the parameters or boundaries of the analysis?
  6. What is the target of the inferences?

Ten (10) steps of content analysis

  1. Read the entire transcript; make notes in the margins when interesting or relevant information is found
  2. Go through the notes in the margins and list the different types of information found
  3. Categorize each item and describe what it is about
  4. Identify whether categories can be linked; list them as major categories (or themes) and/or minor categories (or themes)
  5. Compare and contrast the major and minor categories
  6. If there is more than one transcript, repeat the first five steps for each
  7. Aggregate the categories and themes; examine each in detail and consider whether it fits and how relevant it is
  8. Categorize the data into major categories/themes; review to ensure the information is categorized accurately
  9. Review the categories and consider whether some can be merged or need to be sub-categorized
  10. Return to the transcripts and ensure that all information that needs to be categorized has been categorized

The process of content analysis is lengthy and may require the researcher to go over and over the data to ensure they have done a thorough job.
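The bookkeeping behind the aggregation and review steps can be sketched in a few lines. The example assumes the margin notes have already been written and categorized by hand; the transcript identifiers, notes, categories and theme mapping are all hypothetical.

```python
from collections import defaultdict

# Hypothetical coded margin notes: (transcript id, note, category).
# A category of None marks a note that has not been categorized yet.
notes = [
    ("T1", "could not find the article online", "access"),
    ("T1", "subscription too expensive", "cost"),
    ("T2", "asked a librarian for help", "support"),
    ("T2", "paywall blocked the download", "access"),
    ("T2", "unsure which database to use", None),
]

# Hypothetical mapping from minor categories to major themes.
THEMES = {"access": "barriers", "cost": "barriers", "support": "enablers"}


def aggregate(notes, themes):
    """Group notes by theme and category; collect uncategorized notes."""
    grouped = defaultdict(lambda: defaultdict(list))
    uncategorized = []
    for transcript, note, category in notes:
        if category is None or category not in themes:
            uncategorized.append((transcript, note))
        else:
            grouped[themes[category]][category].append((transcript, note))
    return grouped, uncategorized


grouped, uncategorized = aggregate(notes, THEMES)
for theme, categories in grouped.items():
    for category, items in categories.items():
        print(theme, category, len(items))
print("still to categorize:", uncategorized)
```

The list of uncategorized notes corresponds to the final check against the transcripts. The judgment about which categories to link or merge is still the researcher's; the code only keeps track of the result.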

What to look for in CA?

  • The researcher should give a clear description of the context, the selection and characteristics of participants, the data collection and the process of analysis
  • The content analysis may comprise a conceptual analysis or a relational analysis
  • Content analysis has most often been thought of in terms of conceptual analysis (also known as thematic analysis), in which a concept is chosen for examination and the analysis involves quantifying its presence
  • Relational analysis, like conceptual analysis, begins by identifying the concepts present in a given text or set of texts; it then goes further by exploring the relationships between the concepts identified
  • A good content analysis is one where the researcher analyzes and simplifies the data and forms categories that reflect the subject of study in a reliable manner
  • The categories should cover the data completely; it may be necessary to demonstrate the links between the data and the results
  • Including appendices and tables may be useful to present the links and the results visually
  • Downe-Wamboldt describes content analysis as a research technique that provides a systematic and objective means of describing and quantifying phenomena; content analysis is more than a counting game, as it is concerned with meanings, intentions, consequences and context
  • As with other research methodologies, CA requires consideration of bias when making assertions about the data, their meaning and their generalizability. It is important to reiterate the need for the training of coders and the assessment of reliability and validity
  • Human coders are used in content analysis. Neuendorf suggests that at least two coders should be used. The reliability of human coding is often assessed using a statistical measure of intercoder reliability, or "the amount of agreement or correspondence among two or more coders" (Neuendorf, 2002).
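Intercoder reliability for two coders can be sketched as follows. The text above does not name a specific statistic, so the choice of simple percent agreement and Cohen's kappa (a chance-corrected agreement measure) is an assumption made for this illustration, and the category labels are invented.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of items to which both coders assigned the same category."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty item set")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Chance agreement: probability that both coders pick the same
    # category if each assigns categories at their own observed rates.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to the same eight text units.
coder_a = ["bias", "bias", "neutral", "neutral",
           "bias", "neutral", "bias", "neutral"]
coder_b = ["bias", "neutral", "neutral", "neutral",
           "bias", "neutral", "bias", "bias"]

print(percent_agreement(coder_a, coder_b))  # 0.75
print(cohens_kappa(coder_a, coder_b))       # 0.5
```

In this example the coders agree on six of eight units (0.75), but because each uses both categories half the time, agreement of 0.5 would be expected by chance, which gives a kappa of 0.5. Other measures exist for more than two coders or for other data types.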
