See our Sifter FAQ for more information.
There are now 60 scholarly mentions of the role DiscoverText plays in research, which you can consult for examples of social data collection and research. We find them on Google Scholar. If you publish a conference paper, an article, or a book, we would be very grateful if you sent us a note (email@example.com) with the title and best URL to help other people find your work. Scholarly mentions of the tools and your methods help new DiscoverText users locate best practices and will lead to greater citation of your work in new literature.
We are also pleased to share a wonderful testimonial (http://bit.ly/1ykZydc) occasioned by the publication of “The Internet and European Integration: Pro- and Anti-EU Debates in Online News Media” by Asimina Michailidou, Hans-Jörg Trenz and Pieter de Wilde. Asimina writes:
DiscoverText made it possible for us to capture, map and deconstruct online political talk about the European Union found in a variety of online sources: Twitter, Facebook, political blogs, forums and online news platforms. The flexibility of DiscoverText is simply wonderful: A project involving 8 coders, sitting in 4 different European cities and working in many different languages could have easily turned into a methodological disaster, but DiscoverText enabled us to process all kinds of text from several countries in a very systematic and consistent manner. Stu has been very supportive throughout our project, always responding to our queries swiftly and efficiently. For us, there is no doubt that DiscoverText is an indispensable tool for online media research!
We love to hear success stories. If you have one to share, send it over and we will add you to the list. Finally, we have some leftover t-shirts from recent trade shows, including an AC/DC-inspired shirt from Big Data TechCon. We will mail you a shirt for the price of a Tweet. To get a free shirt, post something on Twitter about the academic uses of DiscoverText. Then, send us your postal address and a link to the Tweet (firstname.lastname@example.org) and we will mail you a shirt. Please indicate Men's or Women's and your preferred t-shirt size. Thanks, ~Stu
To quickly learn the latest about gathering tweets for research, importing SurveyMonkey data, or using state-of-the-art text analytics tools, check out our most recent Texifter tutorial videos. While they have a bit of a “home brew” flavor, we’ve been told they do help jump-start the process of learning about exciting new tools for text.
Texifter Announces Strategic Partnership with SurveyMonkey
to Improve Survey Data Analytics
AMHERST, MA, May 27, 2014 - Texifter, a developer of social data and text analysis tools, today announced a new strategic partnership with SurveyMonkey, the world’s largest survey website, to provide advanced text analytics capabilities to SurveyMonkey users through its cloud-based platform, DiscoverText.
SurveyMonkey is known for intuitive interfaces and communications features that allow researchers to collect millions of survey responses every day. When surveys produce very large numbers of responses to open-ended questions, it can be a challenge to analyze all of the verbatim data. This is especially true for those relying on spreadsheet software as their primary text analytics tool.
DiscoverText provides an accessible “point and click” solution for these and other analytic challenges. Starting today, all DiscoverText users will be able to log in to SurveyMonkey to easily import existing survey data. Researchers can use a 30-day free trial to apply the full range of DiscoverText’s powerful software tools to both the open-ended answers and the structured survey metadata. Texifter’s “five pillars of text analytics” approach combines search, filtering, clustering, human coding, and machine learning.
Once registered on DiscoverText, newcomers have access to a wide spectrum of online data feeds. Facebook, Tumblr, YouTube, WordPress, Disqus, and Twitter data can be gathered, managed, and analyzed in DiscoverText alongside SurveyMonkey responses, email, and other forms of text data.
“The Texifter team is excited to be introducing SurveyMonkey users to the powerful and flexible text analytics tools in DiscoverText,” said founder and CEO Stuart Shulman. “We are confident that once people try out features like clustering and custom machine-learning, they’ll begin to see new possibilities for generating insights from bigger and more diverse collections of unstructured free text.”
This strategic partnership signals the latest phase in the evolution of DiscoverText. Originally built for federal agencies sorting large-scale public comment collections, the four-year-old collaborative research platform now serves a wide variety of public and private sector clients, as well as the academic research community.
Texifter is a spin-out company based on information science research by Dr. Stuart W. Shulman, who directed the development of numerous human language tools for reviewing large numbers of public comments.
Document relevance is a key challenge for social media research. The specific problem of “word sense disambiguation” is widespread. If I am interested in “banks” where money is stored, I want to exclude mentions of river banks. If I am “Delta” airlines, I do not want to see social data about Delta faucets, Delta Force, or those pesky river deltas. If I run a sports team like the Pittsburgh Penguins, the massive numbers of Facebook posts and Tweets about flightless but adorable birds are equally problematic. Very few social media analytics projects can easily avoid the challenge of sorting relevant from irrelevant documents.
At Texifter, we have refined a powerful set of tools and techniques for doing word sense disambiguation. This 5-minute video uses the example of Governor Chris Christie to illustrate how the five pillars of text analytics can help anyone to identify and remove irrelevant documents from an ambiguous social data collection. The principles are very similar to spam filtering in email; we use the same mathematics. Using DiscoverText, we argue an individual or small collaborative team can create a custom machine classifier for the task in just a few hours. Someday, we hope to get this down to a few minutes.
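As a rough illustration of the spam-filter analogy, here is a minimal sketch of a relevance classifier trained on a handful of human-coded items. It assumes a multinomial naive Bayes model with add-one smoothing, a common choice for spam filtering; the post does not say which algorithm DiscoverText actually uses, and the class name, labels, and sample texts are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesRelevance:
    """Hypothetical naive Bayes classifier; not DiscoverText's implementation."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of coded documents
        self.vocab = set()

    def train(self, text, label):
        """Add one human-coded document to the model."""
        self.doc_counts[label] += 1
        counts = self.word_counts.setdefault(label, Counter())
        for token in tokenize(text):
            counts[token] += 1
            self.vocab.add(token)

    def classify(self, text):
        """Return the label with the highest log-probability score."""
        total_docs = sum(self.doc_counts.values())
        tokens = tokenize(text)
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for token in tokens:
                # Add-one (Laplace) smoothing so unseen words do not zero out a label.
                score += math.log((counts[token] + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical hand-coded examples for the "Penguins" ambiguity.
coded = [
    ("Penguins win in overtime, Crosby scores the goal", "relevant"),
    ("Pittsburgh Penguins hockey game tonight at the arena", "relevant"),
    ("Great save by the Penguins goalie in the third period", "relevant"),
    ("Penguins huddle on the Antarctic ice to keep warm", "irrelevant"),
    ("The zoo has baby penguins, such adorable birds", "irrelevant"),
    ("Emperor penguins swim and dive for fish", "irrelevant"),
]

model = NaiveBayesRelevance()
for text, label in coded:
    model.train(text, label)

hockey_label = model.classify("Penguins goalie makes a save in overtime")
bird_label = model.classify("Adorable baby penguins at the zoo")
```

With only six coded items this is a toy. In practice a classifier like this would need many more human-coded examples before its labels could be trusted.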
Just about six hours left to win valuable historical Twitter datasets and powerful text analytics software. This is by far our best Facebook raffle yet. To enter:
The winner will get three 10-day historical Twitter datasets, with Power Track search operators enabled by our friends @gnip, as well as gratis use of the DiscoverText software platform. Runners-up will also get valuable software prizes for a full year.
Just in time for the 2012 GOP convention, we are running a special offer to provide full Twitter fire hose access via the Gnip-enabled Power Track for Twitter:
Never miss a tweet. Full coverage with no rate limits. Powerful search rules, text analytics, clustering and machine-learning via custom machine classifiers.
This is the latest DiscoverText filtering feature designed to speed up the creation of accurate custom machine classifiers. This video shows how we use an interactive display of classifier scores to isolate items in a dataset that require further human coding to improve the accuracy of the classifier. Click on the screenshot below to start the video.
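The idea of using classifier scores to find items that need more human coding can be sketched as follows. This assumes each item carries a score between 0 and 1 and that the items worth recoding are those nearest 0.5, where the classifier is least certain. The post does not specify the score scale or the selection rule, so the function name, thresholds, and sample data are hypothetical.

```python
def needs_human_coding(scored_items, low=0.35, high=0.65):
    """Return (text, score) pairs with low <= score <= high, most uncertain first."""
    uncertain = [(text, score) for text, score in scored_items if low <= score <= high]
    # Sort by distance from 0.5 so the least certain items are reviewed first.
    return sorted(uncertain, key=lambda pair: abs(pair[1] - 0.5))


# Hypothetical classifier output: (item text, score that the item is relevant).
scored = [
    ("Penguins win in overtime", 0.97),
    ("Penguins spotted near the harbor", 0.52),
    ("Baby penguins at the zoo", 0.04),
    ("Penguins tickets for sale", 0.61),
    ("Penguins documentary tonight", 0.38),
]

review_queue = needs_human_coding(scored)
```

Items scored near 0 or 1 are left out of the queue, so coders spend their time on the cases most likely to improve the classifier.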