eBooks, Authors and Social Media Outreach

Welcome, and thanks for coming. Today, I will summarize key
findings of my dissertation research and present to
you the overarching purpose of the research I conducted,
the theoretical framework I developed that extends
existing theory and puts the research
into context, the specific questions
and hypotheses I developed that the research addresses,
the methodology I used to conduct the research,
the results of the research and interpretation
of those results, and the importance of this
work and its greater impact and directions for
future investigation. The book-publishing landscape
has changed dramatically during this first decade
of the 21st century. These changes are driven
by advances in technology, the evolution of
computer-mediated communication channels, and
marketing forces that continue to concentrate
mainstream publishing houses into mega-conglomerates,
while at the same time spawning thousands of
niche market players and perhaps hundreds of
thousands of self-publishers. Today people look not only to
the mass media for consumer information but also to social
media where recommendations come from friends and family
or like-minded consumers. In the meantime,
traditional sources of reviews and advertisements
such as newspapers and magazines
continue to decline. Technology has made it easy
for authors to self-publish. And new business
models, such as print on demand and computer-based
authoring tools, have eliminated barriers
of printing, warehousing, and sales outlets. Ebooks, many of which
may never be printed, now account for 20% or more
of sales for major publishers. There is a
hyper-abundance of choice and a heterogeneous
marketplace comprised not only of bricks
and mortar stores but online marketplaces with
virtually unlimited choices. In 2011, at least 1.5
million new titles were published in
the United States alone, a 400% increase
in just five years. And the number of
self-published titles has eclipsed titles produced
by mainstream publishers. The full count is, in
fact, unknown and virtually unknowable since large
numbers of new titles never acquire an ISBN number. Only a portion of sales
of books, and especially of ebooks, are tracked by
companies such as Nielsen. And then only sales from major
outlets are typically counted. Only a small number of
titles are reviewed annually by the trade and popular
press, perhaps 25,000 or 30,000 at most. And opportunities for
previously common sources of book browse such as displays
in bricks-and-mortar stores are becoming less common. How do readers
find the books they want to read given the number
to choose from and navigate the diffuse and disparate
sources of information about them? Information about
books is moving online, and marketing for books
is moving online as well. According to marketing
surveys, friends and family are among the important sources
for book recommendations along with computer-mediated
social channels, advertising in various
forms, search, and automated
recommendation systems. Many of these
interactions occur today within online social networks. Traditional mass
media marketing is out of the price range of all
but the likely bestsellers from major
publishing conglomerates. Authors today, self-published
and traditionally published alike, are
proactively taking action in order to connect
readers to their books. And even traditional
publishers with media budgets expect authors to shoulder
some of the responsibility for getting their books
into the hands of readers. The research I’m going
to describe to you today looks at how authors
use social media and how social media can
be used as a strategy to connect readers to books. In particular, the
research looks generally at samples of current
literary output and the kinds of
social media outreach the authors of those
works are using to connect with potential readers, if any. This research
provides a first look at, and an estimate of, the relative
impact of those strategies on discovery and
readership and lays a foundation of understanding
for how these strategies might work. The name I’ve given to the
framework I’ve developed is social gatekeeping. And I’ll start by describing, at
least briefly, its foundations in theories of communication,
marketing, social network theory and dyadic communication,
and library and information science. Social gatekeeping
extends these theories to support my research. Point 1, I’ll first mention the
traditional publishing chain as described by book industry
expert, John B. Thompson. This view of publishing
draws from the management and marketing literature on
commercial supply and value chains. The publishing chain
consists of the succession of intermediaries who act to
select, filter, and add value to a literary work as
it moves from the author to the publisher to the market
and, finally, to the reader. So we start with the author
who creates the content and follow it along
to the publisher where various
intermediaries add value, such as copy editing, design,
typesetting, proofreading, printing and binding and sales
and marketing, warehousing and distribution, and
finally to booksellers who get the work into
the hands of readers. Through the end of
the 20th century, this supply chain
was the principal way books made their way
from author to reader. There were few alternatives. There was so-called
vanity self-publishing. But vanity publishing lacked
access to the publishing chain. And very few books published
that way found readership. As traditionally conceived
and illustrated by Thompson, the publishing supply
chain relies principally on mass media,
including both marketing and reviews, to initiate
information flow and promote discovery and sales. Point 2, gatekeeping was first
proposed by Kurt Lewin in 1947 and was initially presented in terms of
the channels through which food came to the table,
such as from the garden or from the market. Gatekeepers are the
decision makers at key points in the channels,
called information gates, where critical decisions
about what might pass through the gate are made. Lewin found that the
key to influencing what made it to the table was
to influence the gatekeeper. And gatekeeping has since been
adopted by many disciplines as a theory,
framework, or model. In the field of journalism,
gatekeeping by editors is posited as the mechanism
by which millions of messages are filtered, modified, and
transformed into the few that make their way to readers. This is primarily an act of
information filtered out. Library science also recognizes
gatekeeping in the literature but in a different context. Librarians and other
information professionals intermediate information
and curate it by finding the best work
and making it available. In this sense, gatekeeping is
a positive force for discovery and knowledge transfer. And this is primarily an act
of information filtered in. So while rejected authors
often think otherwise, gatekeeping is not
necessarily bad. Gatekeeping serves to
filter out poor quality while promoting the best
work by the best authors made better by the services
of experts in the publishing chain, such as
editors and artists. Point 3, once information
reaches someone within a social network, say
through mass media advertising, social network theories
and communication theories describe how
information diffuses through those networks. I’ll mention the two-step
flow of communication theories developed by Paul Lazarsfeld in
1944 and subsequently by Elihu Katz beginning in 1955. These theories were proposed
as a counter to early theories about mass media that
posited that mass media had direct and powerful
effects on individuals. The two-step and
multi-step flow theories suggested instead that mass
media reached certain people who were then responsible
for influencing others in their social circles. I’ll also mention
Everett Rogers’ diffusion of innovations work
first presented in 1962 with several
additions since then. Rogers writes that the first two
phases of innovation adoption are knowledge, which is
discovery, and persuasion, the influence to adopt. Rogers writes that, quote,
“diffusion and adoption gatekeeping is controlling
the flow of messages through a communication channel. One of the most
crucial decisions in the entire innovation
development process is the decision
to begin diffusing an innovation to potential
adopters,” end quote. Point 4, Mark
Granovetter’s 1973 article, The Strength of Weak Ties,
offered an explanation of how information traverses
social networks by positing that new information often comes
to an individual from those in one’s network who
are socially distant. Granovetter noted that, while
close friends tend to be homophilic– that is, alike in
things like education, taste, opinions, and views– close friends aren’t a good
source for new information because there is a good
chance the information is known already. And while homophily and
strong-tie relationships are important factors
in persuasion, it is weak ties who
hold information not known to the strong-tie individuals. The weak-tie effect
on sharing behavior on Facebook and other
networks has been experimentally demonstrated. And another factor
is that individuals tend to have more weak
ties than strong ties. Point 5, finally
from library science, we have theories
and explanations for information-seeking
behaviors. And two of them often compared
are search and browse. Searching– that is, looking
to find something specific– is more often than not focused,
convergent, goal-oriented, and systematic. Browse, on the other hand,
is less focused and used when specific information
needs are not yet fully defined or understood. Browse then, compared to
search, is divergent, dynamic, and undirected. Successful browse
depends on the seeker to recognize and make
new associations that serve individual
needs for information. In the library literature, a
term often applied to browse is serendipity: an
unsought, unintended, and/or unexpected
discovery and/or learning experience that happens
by accident and sagacity. Point 6, if a person knows
the author and title of a book or has a very focused idea
of what the book needs to be about, then
conventional search strategies are often successful. But other strategies
may be employed. A person may ask
a friend or family member for a recommendation
or may visit a social website where readers hang out to browse
favorite lists or author pages or may turn to blogs
or other venues. These are all cases
where an individual may find and connect with
people not in their networks, even weakly. Computer-mediated
communication channels make it relatively easy to find
previously unknown individuals who may, through their
posts and messages, trigger an information transfer
to the browser, who in turn may create, forward, or otherwise
move the information across social networks where
no prior measurable tie exists. I’ve called this phenomenon
the serendipitous tie, which is an incidental,
chance, or accidental interpersonal
relationship event that may occur between people
not otherwise socially connected by means of which
information may be passed and communicated from one
individual and potentially one social network to another
individual and social network. This slide shows a
Venn diagram view of the classical view
of information diffusion between networks via a
shared weak-tie relationship. Here, weak ties serve to
bridge information flow between social networks. This slide shows a view
of the way information penetrates a social
network either through serendipitous ties in
which information is exchanged between people who
do not necessarily have weak-tie affiliations and
also directly from mass media influence. Network researchers are
aware that information comes to a network
from external sources. But this is something they
often control for in research. The classical view
of weak-tie influence was confirmed in a study
by Bakshy and others that looked at over 200 million
wall posts across Facebook’s entire platform
in order to study sharing patterns of strong and
weak-tie relationships, which by the way convincingly
demonstrated Granovetter’s weak-tie hypothesis. In this study as a
method of control, they actually blocked
some shared posts in order to determine
a range of limits to the amount of information
shared by individuals who did not get the information
from strong or weak-tie Facebook connections. But I think that’s an
important construct. And the serendipitous
tie, as I’ve called it, is actually a pretty
interesting phenomenon, in part, because it appears that
the serendipitous exchange of information emulates mass
media effects as described by the two-step flow and diffusion
of innovations theories I’ve mentioned. So to summarize,
social gatekeeping can be initially defined as the
process of finding, selecting, filtering, and
shaping information about a product, service, or
idea and making it available, or not, as a message accessible
in a social communication channel. The message is the
unit of analysis identified by a URL or some
other kind of identifier. Further, the more messages
there are that are shared, the greater the web presence
of an author or book. Once these messages
are out there, not only can people find them,
but applications and processes can find them. And these often form the basis
of recommendation engines which analyze, pool, and extract
social data as a marketing technology. So social gatekeeping
can also be mediated by machine processes. And machine processes can
connect people and their data serendipitously even without
their explicit knowledge. Now, let’s take a new look
at the traditional publishing chain. The traditional publishing
chain is the sequence of gatekeeping decisions through
which a book and its metadata progress from author to agent
to publisher to distributor and finally retailer. At each step, a
gatekeeper makes decisions about aspects of the book
and adds value to it. In the traditional
publishing model of the 20th century
shown here on slide 10, the publisher stands as the
primary gatekeeper directing the flows of the book and
information about the book, the book metadata, as it works
its way to the retail channel. While the physical
book is working its way through editing, design,
printing, and distribution channels shown here in green,
information or metadata about the book is made
available through mass media advertising and marketing,
through a mass media network of reviewers and critics. And that flow is shown in red. The reading public
might find it directly. For example, a reader finds
the review or sees the ad or finds the book in a library
or on a promotional table inside a bookstore. Or the reader might
find it indirectly from a person of influence,
such as a friend or respected acquaintance who has
learned of the particulars from the mass media. Then word of mouth
spreads the information through social networks. Traditional
publishers have always relied on social networks and
the word of mouth diffusion to reach readers. For mainstream publishers
of the 20th century, the diffusion of
information began primarily through mass media
with a smattering of direct to consumer
marketing and direct marketing to persons of influence,
such as book club leaders. However in the 21st century
as shown here on slide 11, computer-mediated communication
facilitates direct discovery of books through online
venues that bypass mass media gatekeepers and
provide new mechanisms for the flow of book metadata. This includes discovery
at point of sale, such as online
markets, and discovery through computer-mediated
channels, such as websites, blogs,
social network fan pages, and other online
social information constructs. The expanded view of
the publishing chain is as it exists at the
beginning of the 21st century. It shows that authors
and publishers no longer are constrained by the
traditional publishing and mass media gatekeepers. Mainstream traditional
gatekeepers still exist, and many readers trust them to
provide high-quality reading experiences by virtue of
the gatekeeping process. But now, technology,
computer-mediated communication channels, and
social networks have made it possible for individuals
to act as gatekeepers to their friends
and acquaintances as well as to the
web-browsing public. For the author or publisher,
social gatekeeping may be a strategy
that supplements, and, in some cases
supplants altogether, the role of mass media in
making information about a book visible and which triggers
the diffusion of information about the book through
social networks and generally through the web. Authors have many
potential pathways to enhance
discoverability for readers. The principal organizing
research questions posed for this research
are, how and to what extent do authors connect to
readers through social media? And what is the extent to
which such use increases discoverability and readership? This is an important
and fundamental question that needs to be addressed
as the first step in testing the social
gatekeeping framework. While this alone is insufficient
to establish social gatekeeping as a robust gatekeeping
theory extension, a negative result would
serve to cast doubt on the framework’s viability. The research focuses on
ebooks as an emerging form of literary production
now 20% or more of sales even among major
conglomerate publishers and which, by their very nature,
can only be acquired and read through some form of
computer-mediated communication medium. This makes them ideal
candidates for a study on the use of computer-mediated
communication by authors. And further, while many
ebooks have print versions, many are digital only. Selecting ebooks as
the object of research provides an opportunity to
compare digital-only versions with the digital-plus
print versions to see if there
are any differences in the impact of author web
presence on discoverability and sales. Point 2, a limitation
of the study is that the results
and interpretation are only strictly generalizable
to books released and sold by Amazon, which was the
focus of this research. Amazon accounts for the majority
of sales in the book market at an estimated
60%, or even more, according to some analysts. It’s true that there are
an unknown number of books released by and
available for sale at outlets other than
Amazon that might generate different results. Limiting the study
to Amazon data is primarily a result
of the difficulty in generating a true random
sample selection of titles from other sources. Of several potential
sources reviewed, only Amazon provided both
the search-browse function that could return a
complete population of books in a non-biased return order
and robust computer-based access to the ebooks’
internal metadata. At this point in time, Amazon
is the best operational choice to study the current
universe of ebooks. The situation will undoubtedly
change, perhaps sooner rather than later, as other players
become stronger and as ebooks evolve. When that happens, the
results from this Amazon study will provide a baseline from
which to observe changes. Point 3. So the research is based on a
random sample of ebooks drawn from the total population of
8,000-or-so ebooks released on Amazon between March 31
and April 5, 2012, inclusive. The use of a true random sample
drawn from a total population is a standard assumption
of many statistical tests, including regression, which was
the primary statistical tool used for the
quantitative analysis that I’ll discuss shortly. Point 4, I also generated a
list of the most popular ebooks from lists of the
bestselling paid and most downloaded free
ebooks on April 6, 2012. The second sample was
used to provide a look at successful ebooks and compare
these exemplars with books from the random sample. There are some
interesting differences between the two
samples that you’ll see when I present
some of the results. Note though that the popular
sample is not a random sample nor is it intended or
used as a control group. It was collected and
tracked along with, but separately from, the
random sample data set in order to provide
exemplars that could be used to compare aspects
of popular ebooks and authors with ebooks and authors
in the random sample. Unlike the random sample, which
consists of ebooks released during one short period
of time, titles in the popular sample
were not date-limited and so may have been on the
market for months or years and undergone previous
cycles of popularity. Further, author web
presence may have changed over the
course of the release where the title
was on the market for an extended period of time. These factors should be taken
into account therefore when interpreting the results
I’ll present shortly. Point 5, data about the ebooks
were tracked and collected weekly for 15 weeks
throughout the summer of 2012 using custom software and
largely automated methods. While that was going
on, information about publishers and the
author’s use of social media along with other
descriptive data was collected manually
through search techniques. The diffusion of
information about a book can be measured a
few different ways. Search engines can be used
to perform specific queries about books. An account of returns– that is, the search
engine hit count– will reflect book web presence
if the query has high recall and precision. Sales can be directly
compared if known or inferred from estimates derived
from Amazon sales rank. Presence may also be reflected
in counts of reader reviews. Although it may not
be possible to count every instance of relevant
web pages, sales, or reviews, sampling them consistently
and without bias provides a means of comparing
titles and estimating the extent to which
certain factors might predict greater diffusion. Slides 14 and 15
show the main data elements collected for analysis. The dependent variables designed
to measure book web presence include search engine
queries on ASIN, which is the Amazon
Standard Identification Number, assigned to each
ebook and also searches on a specific quoted
author-title phrase, Amazon sales rank, Amazon review
ratings and counts, and offer price along with a
few other miscellaneous data points. Independent variables
are examined to determine
whether they predict differences in the
measured dependent variables. Six methods of
social media outreach were collected as the
independent variables for author web presence: an Amazon author
page, which an author may claim and use to post biographical
information and links to other social sites; a Goodreads author page, which
similarly provides a mechanism for authors to connect to
readers on that social network; a Facebook page; a Twitter
account; a website; and a blog. These were selected
for review based on an earlier review of selected
ebooks as being representative of the social media
use by authors. The research naturally
fell into three phases, which are interrelated and
that, in total, provide evidence consistent with
social gatekeeping from multiple perspectives. Phase 1 included the
collection, disaggregation, and classification of
the data collected. Statistics generated
from phase 1 include totals and subtotals
plus some direct counts of sales that can be
grouped by social media use and other categories such
as mainstream-published or self-published. Because not all books in
the random or popular sample were suitable for analysis
of author’s social media outreach– for example,
out-of-copyright classics, magazines released as ebooks,
and some other categories– the classification
effort was also used to identify the subset of
approximately 325 ebooks used for analysis in phase 2. Phase 2 of the
research consisted of analysis of author web
presence and social media participation as independent
predictor variables and book web presence and
sales as dependent variables. Multiple regression
was the primary tool used to determine the degree
to which independent variables could predict variance in
the dependent variables. The social gatekeeping
framework suggests that author web presence and
participation in social media should associate positively
with book web presence and sales. The multiple
regression analysis was used to show which
specific author web presence and social
media activities might best predict sales
and discoverability. The purpose of the
phase 3 research was to conduct a review of
a selected group of titles from the random
and popular samples in order to gain additional
insight on how authors use social media, how
authors may be leveraging the serendipitous tie, and
what such a review might suggest for future research. In all, about 35
titles were chosen based on observations made
on initial review during data collection and also
on results of the data collection such as the
books ending data collection with the highest sales rank. The research
questions for phase 1 are focused on
sample description. I’ve also presented some of the
data summarized in table format on the next slide. This slide shows
some of the results I found particularly interesting. Republication of
public domain books continues to represent
a significant portion of title output, which reprises
a 2008 study on which I was a team member on
self-published books as well as industry figures. Self-published book counts,
73% of the total number of current author titles, exceed
even [INAUDIBLE] estimates possibly due to the number of
self-published titles on Amazon without an ISBN. Most books, and especially
self-published books, don’t sell very well or at
all, at least initially. Large numbers of
ebooks are published without print equivalents. There was an unexpected absence
of enhanced ebooks and ebook applications. Because these are highly
promoted on the Apple platform, this may be a platform issue. But we don’t have
a good idea even on Apple of the numbers of
enhanced ebooks compared to the total number of ebooks. So this bears further
investigation. In the random sample but
not the popular sample, I found short works– that
is, 1 to 20 pages or so– mostly self-published. Some industry analysts
think this is an emerging form of literary output. This confirms that
at least to a degree. And it bears further research. There are dramatic
differences in no sales, books offered for sale that
reported no Amazon sales during the data-collection window,
between authors who used social media and those who didn’t. Authors who didn’t
use social media had almost double
the number of no sales, as a percentage of
total, compared with authors who used at least one form
of social media outreach. See the chart, Sales as a
Function of Social Media Use, coming up on slide 19. This is strongly consistent
with the predictions of social gatekeeping theory. So on this slide, I’ve
presented some selected descriptive statistics. You may want to pause
playback if you’d like to study this in more detail. And here are the no
sales and social media use statistics that I
mentioned just a minute ago. And again, if you want to
study this in more detail, you can pause the recording. Now we move on to phase 2. As a statistical tool,
regression is closely related to correlation, and
the background math is very similar. But where a correlation
compares two variables to see correspondence
between them, regression generally posits
one or more variables as independent and one
variable as dependent in order to see
whether, and to what extent, a change in the
independent variable might predict change in
the dependent variable. So regression tells us
about the relationship between several independent,
or predictor, variables and a dependent, or
criterion, variable. For this research,
the dependent variable is one of the measurements of
book web presence or sales, such as the search
engine hit count or sales as estimated from Sales Rank. The independent variables
are the six categories of social media use by authors. And they’re tested as a
group or model against each of the dependent variables. The purpose was to see whether,
and to what extent, author web presence and social media
outreach might predict discoverability and sales. Multiple regression is one
of the most commonly used statistical tools. But that means that
it’s also probably one of the most commonly
misused statistical tools. The common pitfalls
include assuming that, because it tests the
relationship of variables set up as independent
and dependent, that it establishes causality. And that’s simply not the case. Also there are
certain conditions, some listed on this slide,
where the regression may give misleading results. For example, if
the regression is tuned by testing
various combinations of independent
variables in order to achieve maximum
effect size, one variable may be given undue
weight by leaving out a correlated variable. That’s called omitted
variable bias. And there are some
other subtle traps. And finally, regression is
not an experimental method, and experiment is considered
the gold standard for hypothesis testing. This slide shows
the key statistics you need to understand in
order to follow regression. And I’ll explain what
they mean and put them into context on the
next slide when you see the results of my research. R-squared is also called the
coefficient of determination. It represents effect size
or the percent of variance observed in the dependent
variable predicted by the independent variables. Model significance is
measured by the F statistic and p value, which together indicate
the probability of obtaining a test statistic, at least
as extreme as the one that was actually observed, assuming
the null hypothesis is true. In this case, results were
considered significant if this probability was
no greater than 5%. Standardized beta, or
standardized coefficient, is the influence
calculated for each of the individual
predictor variables on the dependent
variable, assuming the remainder of the independent
variables are held constant, expressed in terms of
standard deviations. Each beta is also individually
tested for significance. And only significant betas
are usually reported. Regression indicators
apply to the whole model, including all the
predictor variables tested. Betas for individual
predictors may change as the model changes– that is, if predictor
variables are added to or deleted from the model. The research questions
all involved determining the various relationships
between social media outreach and aspects of
discoverability and sales. Social gatekeeping
predicts that there will be a positive association
between social media outreach, on the one
hand, and discoverability and sales on the other. And these are formally
stated as hypotheses. Simply stated,
the research tests whether authors who
use social media outreach enjoy greater
discoverability and sales and, further, whether some
kinds of social outreach may be more effective
than others. There’s also a research
question and hypothesis related to reviews. In all, 14 regressions
related to the main hypothesis were computed. And all were significant. The only model not reported for
the random and popular samples used Google search engine hits
of the Amazon stock numbers found on blog sites. And the hit counts were
just too low to provide reliable statistics. The 12 remaining
shown here were all significant with a
probability of less than 5% that these results could have
occurred simply by chance. So here on slide 22 is the
table for the random sample. And as you can see,
the R-squared values are in the low to moderate
range, accounting for 18% to nearly 40% of the
variance observed in each of the dependent variables. Simply stated, that means that
the social outreach accounted for between 18% and 40%
of the difference observed in discoverability,
sales, and reviews. We might expect this since
the model doesn’t include other things authors
and publishers can do to increase
discovery and sales, such as mass media advertising. So I consider this a good
result. As you can see, a Goodreads author page
had the best predictability of all the predictors. The way you might read these, if
you look at the Amazon reviews column over on the far
right, is that for each one standard deviation
increase in author participation on Goodreads, you
might expect a 0.38 standard deviation increase in
the Amazon hit count. So that’s how you need
to look at these data. Amazon recently
bought Goodreads. And this number
might explain why. I think it’s interesting that
none of the other predictors were significant, including
Twitter, a blog, or a web page. In contrast now, looking at the
results for the popular sample, we see a shift. Goodreads is no longer
much of a factor. But Facebook takes every
dependent variable. And web page ticks four of them. Overall, the effect sizes–
that is, the R-squared values– are lower, as are the
betas overall, compared to the random sample. One thing you don’t want to
do with a regression analysis is to throw a lot of
independent variables at it. The more independent
variables you include, the more cases you need to be
comfortable with the numbers. When I designed the
study, however, there were some other things
I was curious about. So I ran selected independent
variables against sales to see what the
numbers look like. I was not surprised
to see that having a print version available
might be associated with increased sales. But I was surprised
to find out that, when you look at social media use
not as a dichotomized yes or no variable but actually look
at the numbers of interactions, only number of tweets
was significant. And number of Twitter followers
or number of Facebook friends was not. You might think that
more actions would predict greater hits and sales. But that wasn’t the case here. Also not significant was
whether an author had more than one available book. These are definitely
areas for further study. So what do these results mean? First of all, statistical tests
confirm the basic hypothesis predicting a positive
association between author social media outreach and book
discoverability and sales. There was also a
positive association between reader social
reviews and ebook sales. The effect is low
to moderate, which I believe is to be expected. The emergence of Goodreads as
the most impactful predictor in the random sample supports
the social gatekeeping framework. Goodreads consists of
approximately 17 million avid readers who connect
and post about books with 23 million reader reviews. It has the highest concentration
of individuals posting and sharing
information about books of all the social media tested. So it’s not surprising that, of
all the independent variables associated with
social gatekeeping, Goodreads is the highest. Facebook is also about
friend and family sharing more so than
blogs or websites. Different predictors emerge
for the popular sample along with a lower effect size,
namely Facebook and web pages. So Facebook and the web may
play a more important role in driving sales
once a title has come to the attention of a reader. This would be consistent
with social gatekeeping and the theories on
multistage processes of diffusion and adoption. And finally, some caveats. As I mentioned before, causality
is not established here. The regression values probably
indicate relative importance of, but not absolute
values of, predictability. The regressions test social
media outreach by authors but not social media generally. That is, although
Twitter, for example, wasn’t a significant
predictor here, it doesn’t mean that Twitter may
not be an important discovery tool, just not one
used by authors. The research
questions for phase 3 were concerned with
taking a qualitative look at some selected
authors and titles for confirmation of the
selection of variables, for additional insight
into the results, and as inspiration
for future research. In addition, I wanted to
look for evidence of or ways to get more explicitly
at the serendipitous tie since it’s not really
tested by either the phase 1 or phase 2 analyses. The titles were selected
on the basis of notes I made during phase 1
classification of anything I saw that I judged was
either confirmatory or new and interesting. In all, about 35 titles from
both samples were scrutinized. And the sites were
revisited nearly a year after the initial data
collection to observe changes. I would summarize the
most important impressions as follows. There is considerable variation
in how social media was used by authors, with some using
it in rich and complex ways and others using it sparsely. This raises questions of how
to maximize social media effect and how to evaluate
both qualitatively and quantitatively the
impact of social media use on discovery and sales. [INAUDIBLE] generally
confirms the selection of dependent variables as
appropriate and reasonably complete. Only a few authors
are experimenting with alternatives, such
as Pinterest, StumbleUpon, YouTube, and LinkedIn,
although these numbers appear to be increasing. I was surprised to see
that links to LibraryThing were relatively infrequent,
even though LibraryThing is a popular book-centric
social network. There was a rather
dramatic apparent increase over time in social
site-sharing widgets, which can be used by authors to
track sharing patterns by fans. This is the feature
most indicative of the serendipitous
tie, and it shows that authors understand
the importance of encouraging sharing behavior
by visitors to their sites. This suggests that
future research on the serendipitous tie might
be most fruitfully examined using link-tracking techniques
since information shared without explicitly
observable tie status leaves few traces otherwise. Authors from the popular
sample are active participants in social media, much
more so than authors from the random sample,
which perhaps may be an effect as well as a cause. And finally, some
mainstream publishers expect authors to
come to them with well-developed social strategies
as a condition of publication. This suggests that
publishing experts recognize the importance of social media
as a catalyst for the diffusion of information about
books through the reading public via social networks. Reading and the cultural
production of literary works are at risk without
an effective way of connecting readers to books. As the nature of the
book itself changes as text migrates to
digital form and authors increasingly seek nontraditional
paths to reader discovery and reception,
traditional gatekeepers and their mass media
represent only a small number of channels through which
readers come to discover books. This research informs
key stakeholders in the business and art of
book culture of the changing nature of the reader-author
connection, the emerging role of the author in
connecting books with readers, and the role of social networks
in facilitating discovery and retrieval. It also progresses
gatekeeping theory to accommodate new social
network conceptualizations and lays a necessary
foundation for ongoing research into the study of the
emerging digital book market in coming years. This research did not
use experimental methods such as randomized field trials. Rather, these results
provide the initial empirical foundations of support for
extending gatekeeping theory to include social gatekeeping
as an important construct that should be further
explored and validated using more costly and
complex experimental methods. As such, it establishes
a research agenda that can be progressed using
increasingly sophisticated methods and tools. One of the objectives
of this research was to capture a
snapshot of titles that could be
examined and compared with other snapshots
taken over time. There is considerably
more information that could be mined
from the data collected for this research that is beyond
the scope of this research. So there is more work to be
done, even on these samples and the data already collected. High on the list of
research questions that might be proposed
for future research is reviewing libraries
as a source of discovery and information diffusion– that is, their potential role
as an independent variable, predicting web
diffusion and sales and also their role as
a dependent variable measuring web diffusion. Libraries and the
role libraries play in both bibliographic
control of digital titles and the degree to which and
manner in which they make digital titles available
is an evolving issue that could be informed
by empirical research. The current involvement
of libraries with ebooks is very unsettled. As this study was concluding,
Amazon purchased Goodreads, and it previously
acquired an interest in Shelfari and
LibraryThing, also book-centered social networks. What the outcome
of this will be is unknown at the time
of this writing. But it’s clear that there is
keen interest and awareness by commercial interests in
social networks in general. For the book trade
in particular, the growing importance
of social networks as a catalyst for reading and the ongoing
evolution of the book, in both print and
digital formats, should prompt continued
scholarly examination of the role
social networks and the social gatekeeping
framework play in connecting readers to books. More generally, the social
gatekeeping framework and the role of the
serendipitous tie in propagation of
information through networks should be explored in other
contexts to determine how, to what extent, and
in what ways they are generalizable
to other disciplines and fields of study. Thank you very
much for listening.
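
For readers who want to see how the standardized betas and R-squared values described in the talk behave, here is a minimal sketch in plain Python. It is not the analysis used in the research. The data are invented, the variable names (goodreads, facebook, sales) are hypothetical stand-ins, and only two predictors are used so that the textbook closed-form formulas apply. It illustrates two points made above: a beta is read in standard deviation units, and a predictor's beta can change when another predictor is added to the model.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def standardized_fit(x1, x2, y):
    """Standardized betas and R-squared for a two-predictor regression.

    Uses the closed-form solution in terms of pairwise correlations,
    which is valid only for exactly two predictors that are not
    perfectly correlated with each other.
    """
    r_y1, r_y2, r_12 = pearson(x1, y), pearson(x2, y), pearson(x1, x2)
    denom = 1 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    r_squared = beta1 * r_y1 + beta2 * r_y2
    return beta1, beta2, r_squared


# Hypothetical data, NOT taken from the study: 1 = author uses the
# channel, 0 = author does not; sales is an arbitrary count per title.
goodreads = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
facebook = [1, 0, 1, 0, 1, 0, 1, 0, 0, 1]
sales = [9, 7, 8, 6, 4, 2, 5, 1, 7, 3]

# With a single predictor, the standardized beta equals the correlation.
beta_alone = pearson(goodreads, sales)

# Adding a second predictor changes the first predictor's beta.
beta_gr, beta_fb, r2 = standardized_fit(goodreads, facebook, sales)

print(f"Goodreads beta, one-predictor model: {beta_alone:.2f}")
print(f"Goodreads beta, two-predictor model: {beta_gr:.2f}")
print(f"Facebook beta,  two-predictor model: {beta_fb:.2f}")
print(f"R-squared, two-predictor model:      {r2:.2f}")
```

A beta of b is read as: a one standard deviation increase in that predictor goes with a b standard deviation change in the outcome, holding the other predictor constant. R-squared is the share of variance in the outcome accounted for by the whole model. Models with more predictors, like those in the talk, would normally be fitted with a statistics package rather than these two-predictor formulas.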
