Thresher's QuickCode

Because machine learning can't happen without labeled data.

WHY QUICKCODE


Quickly and clearly label the training data you need to teach and tune your machine-learning models. QuickCode is a label recommender built with patented methodologies that allow your experts to quickly prepare training data for machine learning and other modeling from your text. We’ve helped some of the largest corporations, federal agencies, and universities. How can we help you?

Control your data.

QuickCode is designed to work on your cloud with your data under your rules, allowing you to safeguard sensitive, proprietary, or regulated information.

Leverage your experts.

Your experts’ time is valuable. You don’t want to take them offline for a week to label data. Paired with QuickCode, experts can label thousands of documents in less than an hour.

Iterate quickly.

Decision makers need models built on today’s data, not yesterday’s. QuickCode gives your data science team a way to get fresh training data quickly to tune their models. And they won’t need to tie up your subject-matter experts each time they need to make an update.

Keep it simple.

Our platform is powered by sophisticated methods, built on years of research conducted by our team of data scientists and engineers. We offer an intuitive experience so you can focus on applying your expertise, fast.

QUICKCODE FEATURES

  • We integrate with your existing workflow.

  • We make it easy to get your text in and your labels out.

  • We work on short and long documents and everything in between.

  • We can handle humor, rhetoric, slang, code, and any terms unique to your industry.

  • We can recommend labels in any language, with expertise in Spanish, Chinese, Arabic, and Russian.

  • We can host on our cloud or install on yours.

WHO WE HELP

Challenge

How do you quickly turn texts into labels for your machine learning?

Your manager and clients want you to use machine learning to predict an outcome. You know how to tackle the structured data you have, but what do you do with a column of text data, such as customer comments or technician notes? What if the outcome itself needs to be coded based on the text? You need to quickly and accurately code documents into categories, but you don’t have the time and resources to read and categorize thousands of documents.

Solution

QuickCode will help you classify documents quickly and accurately by suggesting keywords to build precise queries.

Example

A data scientist used QuickCode to create a query in less than 15 minutes that classified more than 5,500 SMS messages as spam or not spam. This classifier had 95% accuracy relative to human coders. Furthermore, the messages classified by the QuickCode-built query helped train a machine-learning model to predict future spam texts. The model performed similarly to one trained with human-coded labels, but was trained in a fraction of the time. The data scientist found it valuable because:

1) They could use the query to explain to their managers what words and numbers were commonly found in spam text.
2) When spammers changed their patterns, the data scientist iterated again with QuickCode to update the query and prediction models, which helped the team stay abreast of changes in spam tactics.
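To make the idea of a keyword query concrete, here is a minimal sketch in Python. It is not QuickCode's method or API: the function names, keywords, and messages are all hypothetical, invented for illustration. It labels a message as spam when it contains any query keyword, then measures how often that label agrees with a human coder.

```python
import re

def matches_query(text, keywords):
    """Return True if any query keyword appears as a whole word in the text."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return any(k in words for k in keywords)

def agreement(predicted, human):
    """Fraction of items where the query label equals the human label."""
    same = sum(p == h for p, h in zip(predicted, human))
    return same / len(human)

# Hypothetical keywords and messages, invented for illustration only.
spam_keywords = {"free", "winner", "claim", "prize"}
messages = [
    ("Claim your free prize now", True),
    ("Are we still on for lunch?", False),
    ("You are a winner! Text 80082 to collect", True),
    ("Call me when you are free", False),
]

predicted = [matches_query(text, spam_keywords) for text, _ in messages]
human = [label for _, label in messages]
score = agreement(predicted, human)  # 0.75 on this toy data
```

In this toy data the last message is a false positive because it contains "free". That kind of miss is what an expert would catch and correct when iterating on a query, and the keyword list itself doubles as a plain-language explanation of the labeling rule.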

Challenge

How do you quickly search and label vast quantities of data for new insights into healthcare, while at the same time explaining your labeling decisions to others?

You’ve identified a factor that could dramatically affect your patients’ well-being and have thousands of records that might help test your hypothesis.

But there are too many documents to review and label on your own. Even if you could, it would take time to explain your categorization system to others or change it as new hypotheses emerge.

Solution

QuickCode helps you identify healthcare documents relevant to your interests quickly and accurately, and explain your data labeling approach to peers.

Example

A data scientist used QuickCode to rapidly identify a subset of medical patients and transparently explain their labeling process. The data scientist began with a hypothesis that familial or social support improved patient outcomes. They then searched 50,000 discharge summaries written by healthcare providers with a single-word query: “social.” Using QuickCode, the scientist then selected 66 recommended labels, leading to the discovery of 33,210 relevant documents in less than 15 minutes.

QuickCode also allowed the data scientist to demonstrate the validity of their data labeling results to other subject-matter experts, and to select more precise labels based on their input.

Challenge

How do you train and refine a machine learning model to identify complaints from your customers?

You want to use a machine learning application to correctly identify and route messages from your customers to the relevant departments. But how do you teach your model to sort customer messages appropriately? Even if you are able to build a classification system, how can you easily explain your labeling criteria to supervisors and others?

Solution

Use QuickCode to create labeled training data that can be used to train machine learning models, while also providing the transparency needed to discuss and refine your work with others.

Example

A user wanted to train a model to recognize complaints of cyber theft. The user started with a single-word query, “hack,” which they used to search 160,000 customer messages collected from over 3,000 financial services providers. Using QuickCode, they iterated through the recommended labels and, in less than 10 minutes, expanded their training dataset to more than 50 times as many complaints. The expanded dataset also covered more than 10 times as many affected financial institutions as the original set, providing a robust selection of training data with which to build predictive models. The labels also provided transparency, allowing the user to discuss their labeling decisions with supervisors and adjust based on their input.
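The loop described in this example, starting from one seed word, reviewing recommended terms, and growing the query, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not QuickCode's patented recommender: here recommendations are simply the terms that co-occur most often with the current query, and the expert's review is simulated by a fixed set of approved terms. All names and documents are hypothetical.

```python
import re
from collections import Counter

# Tiny stopword list, chosen by hand for this toy example.
STOPWORDS = {"my", "to", "was", "and", "the", "a"}

def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def search(docs, query):
    """Indices of documents containing at least one query term."""
    return [i for i, d in enumerate(docs) if tokenize(d) & query]

def recommend(docs, query, top_n=10):
    """Rank new terms by how many matched documents contain them."""
    counts = Counter()
    for i in search(docs, query):
        counts.update(tokenize(docs[i]) - query - STOPWORDS)
    # Sort by descending count, then alphabetically, so ties are stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]

# Hypothetical customer messages, invented for illustration only.
docs = [
    "Someone managed to hack my account",
    "My account was compromised and money stolen",
    "Account hack: unauthorized transfer",
    "Unauthorized transfer, card compromised",
    "Please update my mailing address",
]

query = {"hack"}                                   # single-word seed query
expert_approved = {"unauthorized", "compromised"}  # stands in for human review

while True:
    accepted = [t for t in recommend(docs, query) if t in expert_approved]
    if not accepted:
        break  # no remaining recommendation was approved
    query.update(accepted)

matched = search(docs, query)
```

On this toy data the seed query matches two messages, and two rounds of accepted recommendations grow the match set to four while leaving the unrelated address-change message out. The final query is a short list of words, which is what makes the labeling decision easy to discuss with a supervisor.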

Challenge

Are agencies making mission-critical decisions with incomplete data?

Government data scientists, analysts, lawyers, and researchers share many of the same challenges as their private sector counterparts. But some of their needs are different. Their analysis, models, and predictions shape policies that affect citizens’ lives and inform national security. When the stakes are this high, you need to have the most complete data set possible.

Solution

QuickCode helps your agency’s data experts quickly and transparently curate datasets for analysis, modeling, and prediction. And they can use QuickCode with your data on your cloud. Better words mean better data, and better data mean better predictions.

Examples of How Thresher’s QuickCode Supports Agencies’ Missions 

1) Finding codewords to better understand sensitive online conversations
2) Categorizing writings about suicide bombings and domestic violence
3) Labeling foreign language texts by dialect for better sentiment analysis
4) Creating labels from the slang used to talk about drugs and human trafficking online

Security First

Thresher's QuickCode was built from the ground up with security in mind. We work in the most sensitive environments across intelligence, defense, and civilian government agencies.

  • Install QuickCode on-premises behind your firewall

  • Leverage QuickCode in the cloud through Amazon GovCloud

  • Compliant with FISMA and NIST standard protocols

We are proud recipients of contracts from the DARPA-sponsored Small Business Innovation Research (SBIR) program. Their support is an important part of our broader commitment to continuous innovation and rigorous testing of our core technologies. 

Working With Government Agencies

Thresher is a U.S. Small Business Administration (SBA)-certified small business with a robust federal partnering ecosystem. Contact us today to get started with a proof of concept or pilot program.

We believe that combining what computers do best with what experts do best creates sharp insight.
