The URX Debrief - August 14

Fleksy partners with URX to bring relevant actions to its keyboard app

Fleksy transforms its keyboard into a discovery platform by integrating deep links powered by URX. — TechCrunch

App indexing & the new frontier of SEO

Google works to close the gap between app and web content, with the goal of making mobile interactions more seamless. — Search Engine Land

When one app rules them all

WeChat’s “app-within-an-app” strategy provides an interesting contrast to the “app constellations” that have emerged in the West. — Andreessen Horowitz

Sworkit + Spotify = more completed workouts

Spotify integrations help drive a 40% increase in workout completion for Sworkit users who listen to music when they work out. — URX Blog

The adblocking revolution is on the horizon

iOS9 may legitimize ad blocking technologies, which could present problems and opportunities for industry stakeholders. — The Overspill

Apple & Google are a step ahead in knowing what you want

Apple and Google make strides in building mobile discovery layers that anticipate user needs and recommend actions that will address those needs. — Wall Street Journal

Fleksy Partners with URX to Bring Relevant Actions to World’s Fastest Keyboard

The ability to immediately take action on whatever you are doing is one of the promises of mobile. That’s why I’m so excited to share our partnership with Fleksy, an amazing keyboard app with millions of downloads worldwide.  

Today, we’re helping Fleksy users take related actions, such as finding music, getting tickets, and booking a ride, straight from their keyboard. This powerful partnership will enable Fleksy to assist users with actions never before seen on a keyboard.

Check it out:

URX wants to be the discovery layer that helps users find and take their next action. We’ve built a platform that lets developers like Fleksy build discovery into their own native experience. So with a single integration with URX, Fleksy can connect users directly to content inside apps like Spotify, SoundCloud, StubHub, and Lyft.

Check out developers.urx.com for more info on our API or our native SDKs.  Or shoot us a note at info@urx.com and we’ll be happy to chat.

John

How Sworkit Helps Users Finish Workouts

Sworkit is a personal trainer app for iOS and Android that provides guided, video-based workouts that users can do anywhere.  With over 5 million downloads and a top 5 ranking in the Fitness category, Sworkit knows what it takes for you to accomplish your goals.

The secret? The perfect playlist to give you that extra motivation to get started and keep you going. Sworkit and URX teamed up to make it easy for users to access curated playlists that complement their workout intensity and personalities.

Users love the music widget and are more likely to finish their workout after selecting their perfect playlist:

  • 40% increase in workout completion for users who listen to music when they work out

  • 63% of users who engaged with the music widget clicked through to listen in Spotify

  • 15,000+ followers across Sworkit’s workout playlists on Spotify

And the user feedback in the reviews speaks for itself:

5 Stars - Jun 11, 2015 - "I am too am embarrassed right now to go to the gym and this app is helping me get the exercise I need without the pressure of the gym. Love this app and the fact that it is connected to Spotify for my music."


5 Stars - May 31, 2015 - "Out of all the apps I tried this one is the best. It tells you how many calories you lost plus lets you play a high intensity playlist off of SPOTIFY!!! No charge at all plus you get reward points for completing workouts doesn't matter how little the time."

Thanks to the Sworkit team for their awesome partnership. Download Sworkit on iOS or Android today! #NoGymNoExcuse

Sign up with URX or email us at info@urx.com to learn how you can connect users to related actions from your app.

Named Entity Recognition: Examining the Stanford NER Tagger

Overview

Recently I landed a job at URX through Insight Data Science, a data science fellowship program for people with quantitative PhDs. As part of a new initiative within the program, I was offered the opportunity to work with URX on a unique data science challenge that held real business value. The goal was to develop a Named Entity Recognition (NER) classifier that compares favorably with one of the state-of-the-art (but commercially licensed) NER classifiers, developed by the CoreNLP group at Stanford University over a number of years.

Project

Named entity recognition is the process of identifying named entities in text, and is a required step in building out the URX Knowledge Graph. For the sentence "Dave Matthews leads the Dave Matthews Band, and is an artist born in Johannesburg" we need an automated way of assigning the first and second tokens to "Person", the fifth through seventh tokens to "Organization", and the last token to "Location". Named entity recognition is a notoriously challenging task in Natural Language Processing: there is an effectively unlimited number of named entities, and there may be many ways to represent a given named entity (Dave Matthews, Dave matthews, David Matthews, etc.). A good NER system must label "Dave Matthews Band" as an organization and "Dave Matthews" as a person, even though they share two out of three tokens. The most intelligent NER taggers find ways to use features derived from surrounding tokens to make the best possible prediction about the class of a given token.
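
To make the labeling concrete, here is an illustrative sketch (the tag names simply mirror the Person/Organization/Location classes above) of the tokens and the labels a tagger should produce for that sentence:

# Illustrative only: tokens of the example sentence paired with the labels
# an NER tagger should assign ("O" marks tokens outside any named entity).
tokens = ["Dave", "Matthews", "leads", "the", "Dave", "Matthews", "Band",
          "and", "is", "an", "artist", "born", "in", "Johannesburg"]
labels = ["PER", "PER", "O", "O", "ORG", "ORG", "ORG",
          "O", "O", "O", "O", "O", "O", "LOC"]

for token, label in zip(tokens, labels):
    print("%-12s %s" % (token, label))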

The performance of NER classifiers is evaluated much like that of traditional machine learning algorithms, using metrics such as the F1-score. However, we are concerned not only with accuracy, but also with speed. When crawling new web pages for our corpus, it is essential that entities are extracted in real time (for more about our web crawling, please see our Science of Crawl series).
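
For reference, these are the standard token-level metrics; a minimal sketch of precision, recall, and F1 computed from true-positive, false-positive, and false-negative counts:

def precision_recall_f1(tp, fp, fn):
    # Precision: fraction of predicted entities that are correct.
    precision = tp / float(tp + fp)
    # Recall: fraction of true entities that were found.
    recall = tp / float(tp + fn)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1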

For my project, I compared the Stanford tagger with a relatively simple tagger I built during my time at Insight. Often considered the benchmark tagger in empirical research due to its high accuracy, the Stanford tagger has evolved over many years and is based on a Conditional Random Field (CRF) model. CRFs belong to a general class of algorithms known as graphical models, and are favoured over more classical Hidden Markov Models (HMMs) because they are better able to account for dependencies among observations. While an HMM may mispredict transitions because of the false independence assumptions it makes about the observed variables, a CRF conditions on the entire observation sequence and so captures these dependencies more faithfully.

The in-house CRF is built using CRFsuite via a Python wrapper. To train our classifier, we used a publicly available Wikipedia dataset of 145K labeled sentences. Most available NER training sets are small and expensive to build, requiring manual labeling. The Wikipedia dataset we used was relatively large owing to the innovative, automated tagging method employed to build it, which takes advantage of structured hyperlinks within Wikipedia.

In-house CRF Model Features

The features we included were remarkably simple compared to those of the Stanford NER tagger, and included:

  1. The lowercase token
  2. Whether the word was uppercase or not
  3. Whether the word was title-cased or not (only the first letter uppercase)
  4. Whether the word was a digit or not
  5. Where in the sentence the word occurred (1=end, 10=tenth word)
  6. Analogous information for the previous 3 words and the next 3 words

Note: this builds on the CoNLL tutorial.

In [2]:
# Code snippet from the in-house tagger
class CrfSuite(object):

    def __init__(self, savemodelname):
        self.savemodelname = savemodelname

    def word2features(self, sent, i):
        # sent is a list of (word, POS tag) tuples; i is the index of the
        # token we are building features for.
        word = sent[i][0]
        postag = sent[i][1]
        features = [
            'bias',
            'word.lower=' + word.lower(),
            'word[-3:]=' + word[-3:],
            'word[-2:]=' + word[-2:],
            'word.isupper=%s' % word.isupper(),
            'word.istitle=%s' % word.istitle(),
            'word.isdigit=%s' % word.isdigit(),
            'postag=' + postag,
            'postag[:2]=' + postag[:2],
        ]

        def behind(features, n):
            # Features for the token n positions behind the current one.
            if i >= n:
                word1 = sent[i-n][0]
                postag1 = sent[i-n][1]
                features.extend([
                    '-'+str(n)+':word.lower=' + word1.lower(),
                    '-'+str(n)+':word.istitle=%s' % word1.istitle(),
                    '-'+str(n)+':word.isupper=%s' % word1.isupper(),
                    '-'+str(n)+':postag=' + postag1,
                    '-'+str(n)+':postag[:2]=' + postag1[:2],
                ])
            elif n == 1:
                features.append('BOS')  # beginning of sentence
            return features

        def infront(features, n):
            # Features for the token n positions ahead of the current one.
            if i < len(sent)-n:
                word1 = sent[i+n][0]
                postag1 = sent[i+n][1]
                features.extend([
                    '+'+str(n)+':word.lower=' + word1.lower(),
                    '+'+str(n)+':word.istitle=%s' % word1.istitle(),
                    '+'+str(n)+':word.isupper=%s' % word1.isupper(),
                    '+'+str(n)+':postag=' + postag1,
                    '+'+str(n)+':postag[:2]=' + postag1[:2],
                ])
            elif n == 1:
                features.append('EOS')  # end of sentence
            return features

        # Add context features for the previous 3 and the next 3 tokens.
        for n in range(1, 4):
            features = behind(features, n)
            features = infront(features, n)
        return features

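For context, here is a minimal sketch of how features like these feed into training and tagging with the python-crfsuite wrapper, in the spirit of the CoNLL tutorial. The hyperparameters, the model filename, and the X_train, y_train, and test_sentence variables are illustrative assumptions, not our exact production setup.

import pycrfsuite

# X_train: per-sentence lists of per-token feature lists (built with
# word2features above); y_train: the matching label sequences.
trainer = pycrfsuite.Trainer(verbose=False)
for xseq, yseq in zip(X_train, y_train):
    trainer.append(xseq, yseq)

trainer.set_params({
    'c1': 1.0,                             # L1 regularization
    'c2': 1e-3,                            # L2 regularization
    'max_iterations': 50,
    'feature.possible_transitions': True,  # allow transitions not seen in training
})
trainer.train('wiki_ner.crfsuite')         # hypothetical model filename

# Tag a new sentence (a list of (word, POS tag) tuples).
crf = CrfSuite('wiki_ner.crfsuite')
tagger = pycrfsuite.Tagger()
tagger.open('wiki_ner.crfsuite')
features = [crf.word2features(test_sentence, i) for i in range(len(test_sentence))]
predicted_labels = tagger.tag(features)
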
Performance Test 1

After training our model, we tested both our model and the Stanford model on the dataset used for the NER competition at the 2003 Conference on Computational Natural Language Learning (CoNLL 2003), on which the Stanford tagger reportedly achieves high scores.

In [3]:
from plots import compare_f1s_conll
compare_f1s_conll()

(Figure: F1 scores for the in-house and Stanford classifiers on the CoNLL 2003 test set)

Perhaps not unexpectedly, the Stanford classifier performs at higher accuracy levels than the in-house classifier on this dataset. However, the CoNLL 2003 dataset is also relatively widely used, and it's possible this data was used for training the Stanford classifier (the CoreNLP group does not indicate what data was used in training). As such, we decided to test the two CRF classifiers on a second dataset of 16K manually annotated Wikipedia sentences.

Performance Test 2

In [4]:
from plots import compare_f1s_wikigold
compare_f1s_wikigold()

(Figure: F1 scores for the in-house and Stanford classifiers on the manually annotated Wikipedia dataset)

Somewhat strikingly, our in-house classifier improved upon Stanford's classifier on this dataset, particularly for miscellaneous entities. That the in-house CRF performs better than the Stanford classifier here may be due to the greater similarity between training and testing data for the in-house classifier (both were based on Wikipedia content) than for the Stanford classifier. The particular boost for miscellaneous entities is likely because the in-house CRF was trained on Wikipedia data, which could have included a broader range of miscellaneous entities than the training data for the Stanford tagger. Overall, this analysis highlights a well-known problem in NER research: the performance of an NER system can depend strongly on the data it is tested on. A system that performs well on one manually annotated dataset may not perform as well on others, which makes it challenging to derive a single meaningful evaluation metric.

Speed Comparison

In the following comparison we examine the speed of the Stanford classifier and compare it with our in-house CRF classifier. We looked at the time to perform NER on a random set of 50, 200, 500, or 1000 web documents extracted from our search index. Each document had been previously stripped of its boilerplate HTML using the dragnet Python library, leaving only raw English text, which was then tokenized into sentences for NER. Each condition was run 25 times; the error bars below reflect the standard errors.
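
For illustration, the timing harness amounts to something like the sketch below; the tag_documents callable stands in for whichever tagger is being measured and the documents list is assumed, so this is a sketch of the approach rather than the exact benchmark code.

import random
import time

def benchmark(tag_documents, documents, n_docs, n_runs=25):
    # Time tag_documents over random samples of n_docs documents and
    # return the mean wall-clock time and its standard error.
    times = []
    for _ in range(n_runs):
        batch = random.sample(documents, n_docs)
        start = time.time()
        tag_documents(batch)   # run NER over every sentence in the batch
        times.append(time.time() - start)
    mean = sum(times) / len(times)
    variance = sum((t - mean) ** 2 for t in times) / (len(times) - 1)
    stderr = (variance / len(times)) ** 0.5
    return mean, stderr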

In [5]:
from speedtest import compare_speed 
compare_speed()

The in-house CRF tagger runs approximately twice as fast as the Stanford CRF tagger. While the Stanford classifier was not designed for speed, this shows that it is not difficult to improve upon it along this dimension alone. That is an important consideration when deciding whether or not to use Stanford's NER tagger in production.

Conclusion

Overall I was drawn to URX for many reasons, including (but not limited to): a) exceptional data scientists, as exemplified by the Science of Crawl series, b) the incredible opportunity to develop an exciting search-based product from the ground up, and c) last but not least, extremely kind and generous coworkers. Right now I'm continuing to help move our current version of NER into production, as well as contributing to other science projects, including: a) developing A/B testing infrastructure, b) search re-ranking using crowd-sourced relevancy judgments, and c) a search relevancy dashboard to evaluate our search index.

-Ben

The URX Debrief - July 21

A glimpse into the future of indexed apps

While it’s cool that Siri can accurately help us find tunes inside Apple Music, the profound change will occur when we can search inside all apps. — URX Blog

Growing the App Links community with Bing

Bing is expanding its search index to include apps and app actions so content from App Links enabled apps can appear in mobile Bing search results. — Facebook Developer Blog

App indexing & the new frontier of SEO

For those interested in the most cutting-edge digital marketing strategies, the concept of SEO will have to expand to include optimization of deep app content for inclusion in the Apple and Google indexes. — Search Engine Land

Introducing DeepLinkDispatch

DeepLinkDispatch is designed to help developers handle deep links easily without having to write a lot of boilerplate code and allows you to supply more complicated parsing logic for deciding what to do with a deep link. — Airbnb Developer Blog

The web we have to save

The rich, diverse, free web that I loved - and spent years in an Iranian jail for - is dying. Why is nobody stopping it? — Hossein Derakhshan

Is this the end for apps?

Google Now and Siri will not be the death of apps, they just might be ushering in a new era. — The Next Web

A glimpse into the future of indexed apps

Like many Americans I chose to celebrate my freedom on a road trip last weekend. I needed some tunes for the trip and it was a perfect time to try out Apple Music. I’m not a big Siri user but decided to put it to the test.

While driving I launched Siri and said “Listen to Pet Sounds”. After a slight delay, my screen updated (below) and the first track started playing.

In some order, Siri interpreted my voice command, looked up “pet sounds”, determined I was looking for the influential Beach Boys album, checked to see whether I had signed up for Apple Music, and then started playing it immediately.

It was awesome because it just worked. “Play some summer music” yielded a solid playlist and other searches for artists or albums worked great. While it’s cool that Siri can accurately help us find tunes inside Apple Music, the profound change will occur when we can search inside all apps.

Siri and Spotlight Today

Spotlight today enables us to search through information on our device like contacts, messages, and calendar/mail (if you use the Apple apps) and increasingly offers information like weather, sports scores, and movie showtimes. It doubles as an easy way to launch apps without searching through folders and gives you the option of searching the web via Bing as well.

However, it doesn’t yet have access to the content inside the apps on your phone. For example, when I search for “pet sounds” I see a link to download on iTunes (which is great) but no link for Apple Music or Spotify. That’s where I wanted to head when I entered the query.

I have no doubt Spotlight will start showing Apple Music results soon enough. But what about the other 40 apps on my phone?

The new Search APIs

At WWDC, Apple introduced Search APIs with the goal of adding app content to Spotlight and Safari search. From the WWDC page for the search API:

iOS 9 adds a variety of ways to surface the rich content in your app making search results more relevant. Gain insights into how deep app links can bring people directly where they want to go in your app, making your app’s content even more discoverable and searchable than content on the traditional web.

The APIs enable developers to:

  1. Notify the OS of the recent actions a user has taken so they can quickly navigate back to them. Think “recent history” in a web browser.
  2. Index the content inside the app on the device so it can show up in a user’s results. This would be how LinkedIn surfaces your contacts in the demo but doesn’t make them accessible to the entire world.
  3. Link web to app content. This is the same approach Facebook’s App Links and Google’s App Indexing take to the problem. These tags help Apple surface all the content in an app, even to users who don’t have the app installed.
(Image: Searching for “potato” brings up 3 results from Yummly)

The short version is that Apple wants to help users navigate inside apps to the right place and help users discover new ones. The APIs will also be the foundation for the “Predictive Assistant” (e.g. Google Now) functionality.

On stage, Apple showed examples of results from LinkedIn, Airbnb, and Yummly. For most developers, indexing your app is a no-brainer. I can’t think of many reasons why you wouldn’t want to make it easier for people to find things in your app.

A couple more examples

Here are a couple more examples of how useful search will be when it includes app content.

Take action from the apps on your phone

“I want to listen to the Giants game”

Siri: “Starting to play now from the MLB app”

The only way to listen to an MLB game online is via a paid subscription to the At Bat app. For Siri to give me this answer, she would need to know that the At Bat app enables you to listen to games, find the link to the current game, and understand that I have paid for a subscription.

Compare information across apps (and the web)

“How much would SF Giants bleacher tickets cost for tomorrow’s game?”

Siri: “Seats range from $15–22 on Stubhub, SeatGeek, and tickets.com. Would you like me to open one or just find the cheapest option?”

In this example, Siri is able to compare actions from apps on your phone or the web. She knows I can take the desired action in each and gives me the option of which to pursue.

Discover new apps

“Get me directions for cheap parking near 184 South Park San Francisco”

Siri: There is a lot for $15 0.2 miles from 184 South Park. May I offer another option? There are two new apps, Luxe and Zirx, that offer valet services with a free trial. Loading directions now, let me know if you want to try valet and I’ll download the app to get started.

Here you ask Siri one thing, but she knows that there are other options that address the same problem. This type of contextual discovery is going to be huge for niche apps that are buried in the long tail today.

Kill the TV guide

This one is not like the others. But I want my son to grow up in a world where he doesn’t need to remember that channel 724 is ESPN or remember that The Americans is on Amazon Instant, not HBO, Netflix, or Hulu. Apps are on the way to controlling our viewing experience and basic search and navigation cannot come fast enough.

The computer from Star Trek is coming

We don’t need separate search boxes for the web, the apps on our phone, new apps inside a store, in our car, or on our television set. A single entity should be our daily assistant. Given the prominent role mobile devices play in our lives, whoever controls the discovery interface is in pole position to control our entire digital interface with the real world.

Hence the upcoming Thunderdome cage match between Apple and Google (and Cortana and Amazon’s Alexa) to be our personal assistant. And while the voice input component is cool, it’s going to be the results that come back that determine the winner. In order for those results to be relevant and actionable, they need to include content from apps.

Let the indexing begin.

Note — I didn’t mention deep linking, although the term gets thrown around in these discussions. Deep links are the infrastructure needed to directly access destinations inside apps. But deep links by themselves don’t help us find anything — it’s the search layer on top that creates the sweet experience.

    Web and apps make an unlikely couple #Mobilebeat

    Eric from Yelp, Ethan from Yummly, and Michael from NBC Universal joined John on stage at Mobilebeat yesterday to talk about content discovery on mobile.  The panel spurred some great discussion on engaging mobile users and some of the challenges they face.

    Much of the panel centered on the evolving relationship between web and apps.  “Apps vs the Web” is a tired argument.  But what was interesting about this panel was how quickly everyone agreed that both are important, followed by an interesting discussion of how they work together.  There was general agreement that:

    People discover content via search on the web.  Search traffic and SEO are critical and are a primary driver of new user discovery.  Yummly in particular emphasized the opportunity that still exists for mobile publishers on the web.

    It’s hard to discover new content in apps you haven't already downloaded.  As an example, “The Voice” - the hugely popular TV show - is getting lost among “voice recorder apps” in the App Store.  Despite Google’s efforts to date, search traffic is not linking into apps at scale.

    App users are more valuable, have higher engagement, and are easier to monetize.  However, publishers need to show the primary value of the app “in the first 10 seconds” or they lose people forever.

    Encouraging mobile web users to install an app is a delicate activity.  Some people never install apps and most don’t want to install an app for a single use. However, publishers have every incentive to transition as many people as possible to their app, so it’s important to try.

    What is a publisher to do?

    Understand and accept that you need to make your content available across web and apps in the manner in which people want to consume it.  Study the context in which people find and discover your content and tailor the experience toward it.  Your web and app experiences need to be overlapping yet complementary, and the movement of users between the two is critical to understand.

    Yummly, Yelp, and NBCU were all planning to adapt their apps to take advantage of the new app search capabilities of Siri in iOS 9 and Google Now.  However, they were a bit skeptical of the immediate impact of the changes, as the platforms will have to teach users how to engage with a new search behavior.  In their minds, the future of content discovery on mobile is still up for grabs.

    The web vs. app debate has evolved from winner-takes-all to a world where the two will coexist indefinitely.  Publishers need to understand how people select one or the other and tailor the experience accordingly.

    Add a 'Buy' Button to Your App (No Coding Required)

    We have a cork bulletin board here at URX where we hang print-outs of recent AppViews integrations. For the uninitiated, AppViews are creative units powered by data returned from the URX API. AppViews let developers give their users a “view” into another app using content and deep links from that app.

    Lately, we’ve noticed a pattern on our cork wall of fame: Buttons. Developers are using our API to build things like “Buy” buttons, “Play Music” buttons and “Book a Ride” buttons in their apps. These buttons extend the functionality of their apps, improving the overall experience for users.

    That’s why we’re doubling down on buttons. Today URX is releasing an enhancement to our iOS and Android SDKs that lets you create buttons in your app tied to actions in other apps -- like buying a concert ticket in SeatGeek or booking a car in Lyft -- without having to write any code and without having to build a user interface from scratch.

    On iOS (Android instructions here), simply drag and drop an object onto an interface in Xcode, enter a few words in the object’s inspector and you’re up and running: the object generates an app icon, a call to action (e.g., “Buy Tickets in SeatGeek”) and a deep link to an action in an app that's complementary to the content your user is viewing at any given moment.

    Infusing buttons with this kind of intelligence is now as easy as dragging the URX frameworks into your project in Xcode (or downloading the frameworks with CocoaPods), exposing an object’s identity inspector in interface builder and filling in custom fields like:

    1. Action Type: determine a set of apps to potentially show in the button, each related to an action like buying something, listening to music, traveling somewhere or reserving something

    2. Domain Filter: optionally specify one or more apps you would definitely like to show in the button and resolve users to

    3. Near Filter: specify a location you would like the in-app destination to be related to

    You’ll also want to include a text query with your button which you can do by right-clicking the button in interface builder and dragging its context to an element on the page that contains text.

    (Step-by-step instructions for adding a URX Button on iOS)

    For example, say you’ve created a music news app and you’d like to use URX to render a button that links your users to a musician’s upcoming concerts in a ticketing app.

    Using fields in the button’s identity inspector, you can tell URX that you want your button to surface a deep link to concerts located in a specific city (Near Filter), limit tickets to those which are available in a specific app like SeatGeek or Stubhub (Domain Filter) and only show tickets for a specific musician whose name appears in the <artist-name> label on your page (Initial Query).

    You might also experiment with changing the button from one that lets users buy concert tickets to a button that lets users listen to music simply by changing the BuyAction to ListenAction (Action Filter).

    With URX, you can monitor engagement with the button including clicks on the unit, resolutions into a recipient app and down-funnel events in the recipient app -- such as sales or signups -- resulting from traffic that originated in your app.    

    Behind the scenes, the URX search engine leverages a knowledge graph that sits atop more than 100 mobile services, accepting text-based queries with information like keywords, location and time, turning those queries into entities and concepts and mapping them to the most logical next step for your users.

    The science is rather mind-blowing and it’s all in service of making your app more awesome and making it easier than ever for apps to partner with each other.

    We hope you’ll give URX Buttons a try or create a UI of your own with the JSON-formatted data returned from our API. And please give us feedback as we work to make the process of integrating mobile services into your app as seamless as possible.

    Check out the URX documentation to get started or get in touch by e-mailing us at info@urx.com.

    Other Helpful Links:

    The URX Debrief - July 1

    URX Developer Spotlight

    Marquee, Directlyrics, and Night Out are three of the latest developers to use URX AppViews to connect users to other relevant apps. — URX Blog

    Search, discovery, and marketing

    What we’re really seeing is a trade-off between two problems. You can have a list, solving discovery and recommendation ... or you can have a searchable index of everything... — Benedict Evans

    Apple and Google race to see who can kill the app first

    Our dumb, siloed apps are slowly but steadily becoming smart, context-aware services that link, share, and talk to each other without us having to necessarily see or touch those little squares. — Wired

    Return of the search wars: The rise of contextual awareness

    Search is still a fundamental part of the computing experience. It’s just that search is no longer just about going to a browser and entering a search term – even though that is likely to be many a user’s reflex action. — Extreme Tech

    Deep linking & search in iOS 9 will change everything

    In iOS 9, apps are cohesively linked together via deep links and the experience feels magical. For the first time ever on iOS, there is a fluid system in place to help you navigate between apps. — Nirav Savjani

    How Google is Taking Search Outside the Box

    But app indexing is not just Google introducing another corpus into its search engine. The mobile app-sphere is where people live these days, not so much the web. — Backchannel

    Developer Spotlight: Marquee, Directlyrics, Night Out

    Developers are using AppViews in new and creative ways all the time to connect users to relevant content or help them take action. We’re thrilled to have the opportunity to work with a growing network of partners and wanted to take a moment to highlight a few:

    Marquee: a movie discovery app that lets you find your movie’s showtimes, learn more about your favorite actors and locate the closest theater. Now with URX, once you find your movie and theater, you can book a ride directly in Lyft from the Marquee app.


    Directlyrics: a destination to find lyrics to your favorite songs and read about the latest music releases. Using URX, Directlyrics is able to surface actions in leading native apps -- like YouTube, Spotify and Rdio -- related to the artist and the song a user is presently reading about.


    Night Out: a lifestyle app for discovering experiences and finding tickets to thousands of events, or creating an event of your own. Using URX, Night Out shows an eye-grabbing Lyft button on its digital tickets on iOS, Android and the web. When event-goers double-check their tickets on their mobile device, via email or through the Night Out app, they’ll have an easy way to book a ride before they even have to think about transportation.


    Integrating URX into your app or mobile site is easy. Check out the 3-min video tutorial below on adding our latest pre-built widget: Buttons. And head over to our developer docs for more info.