Saturday 2 January 2010

Hacking LinkedIn API

Note [2011-03-24]: As promised, I have updated the source code. Now it works! Please see my other post, which supersedes this one: Hacking LinkedIn API - Take 2.

Note [2010-03-12]: LinkedIn has changed the login page(s), so the HTML scraping code below no longer works. However, I believe the approach is still valid. I will publish an update when I have some time after hours. Stay tuned.

I was quite excited to discover that LinkedIn was exposing APIs for 3rd-party applications to tap into its data. I downloaded LinkedIn-J 0.1 SNAPSHOT-2009-12-30, a beta Java wrapper for the APIs, from Google Code and tried the example posted on the LinkedIn Developer Network forum. The sample worked as intended.

However, the OAuth procedure adopted by the LinkedIn APIs is convoluted, and the authorisation process assumes your application is a web application. As part of that process, the user is forced to log in directly on a LinkedIn web page (shown below) to get a PIN, and then feed the PIN to the application to continue.

Once logged in successfully, a PIN is given in the next HTML page:

This is no good for machine-to-machine integration. I just want to retrieve LinkedIn data from the backend without forcing the user to be a LinkedIn member or to log in twice. So I decided to bypass this extra login.

Viewing the source of the above login form shows that it is a simple HTML FORM:

...
         
...

Note that the FORM has the following INPUT elements (including the hidden ones and the submit button):

  1. email – use my login to LinkedIn
  2. password – use my password to LinkedIn
  3. duration – set to 0
  4. access – set to -3
  5. agree – set to true
  6. extra – empty
  7. authorize – set to Grant Access
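
The field list above corresponds to an application/x-www-form-urlencoded request body. As a minimal sketch (the class and method names here are hypothetical, not part of LinkedIn-J or SignPost), the body could be assembled with java.net.URLEncoder so that characters such as "@" and the space in "Grant Access" are escaped automatically instead of by hand:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class for illustration only.
class FormBodySketch {

    // Joins name=value pairs with '&', URL-encoding both names and values as UTF-8.
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> field : fields.entrySet()) {
                if (body.length() > 0) {
                    body.append('&');
                }
                body.append(URLEncoder.encode(field.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(field.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
        return body.toString();
    }

    // The seven INPUT elements from the list above, in the same order.
    // The email and password arguments are placeholders for real credentials.
    static Map<String, String> loginFields(String email, String password) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("email", email);
        fields.put("password", password);
        fields.put("duration", "0");
        fields.put("access", "-3");
        fields.put("agree", "true");
        fields.put("extra", "");
        fields.put("authorize", "Grant Access");
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(encode(loginFields("myname@mycompany.com", "mypassword")));
    }
}
```

A LinkedHashMap keeps the fields in insertion order, which makes the resulting string easy to compare against the form when debugging.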

We can easily populate this form and submit it in Java.

...
DataOutputStream dataOut;
      
        try {
            URL url = new URL(authUrl);
            //URL url = new URL("https://api.linkedin.com/uas/oauth/authorize");
            HttpURLConnection con = (HttpURLConnection)url.openConnection();
            con.setRequestMethod("POST");
            con.setUseCaches(false);
            con.setDoInput(true);
            con.setDoOutput(true);
            con.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");

            dataOut = new DataOutputStream(con.getOutputStream());

            // the body must be URL-encoded: '@' becomes %40 and a space becomes '+'
            dataOut.writeBytes("email=myname%40mycompany.com&password=mypassword&duration=0&access=-3&agree=true&extra=&authorize=Grant+Access");
            dataOut.flush();
            dataOut.close();

            //SSLException thrown here if server certificate is invalid
            String returnedHtml=convertStreamToString(con.getInputStream());
            System.out.println(returnedHtml);
...

If successful, returnedHtml will contain the PIN (as shown in the screenshot above). So it is simply a matter of scraping that HTML string to get the 5-digit PIN and feeding it to the next step of the authorisation process. The following is the modified Beta Java SignPost Sample Code that bypasses the LinkedIn login page:

// LinkedIn SignPost Sample Code
// Adapted by Taylor Singletary
// from Twitter SignPost Sample Code ( http://oauth-signpost.googlecode.com/files/OAuthTwitterExample.zip )
// Tested and Functional with attached SignPost JAR
// YMMV

import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

import java.net.HttpURLConnection;
import java.net.URL;

import oauth.signpost.OAuth;
import oauth.signpost.OAuthConsumer;
import oauth.signpost.OAuthProvider;
import oauth.signpost.basic.DefaultOAuthConsumer;
import oauth.signpost.basic.DefaultOAuthProvider;
import oauth.signpost.signature.SignatureMethod;


public class Main {
    static public String getPin(String authUrl) {
        DataOutputStream dataOut;
      
        try {
            URL url = new URL(authUrl);
            //URL url = new URL("https://api.linkedin.com/uas/oauth/authorize");
            HttpURLConnection con = (HttpURLConnection)url.openConnection();
            con.setRequestMethod("POST");
            con.setUseCaches(false);
            con.setDoInput(true);
            con.setDoOutput(true);
            con.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");

            dataOut = new DataOutputStream(con.getOutputStream());

            // the body must be URL-encoded: '@' becomes %40 and a space becomes '+'
            dataOut.writeBytes("email=myname%40mycompany.com&password=mypassword&duration=0&access=-3&agree=true&extra=&authorize=Grant+Access");
            dataOut.flush();
            dataOut.close();

            //SSLException thrown here if server certificate is invalid
            String returnedHtml=convertStreamToString(con.getInputStream());
            System.out.println(returnedHtml);
            /* extract the pin from the html string. the block looks like this
        <div class="content">
          You have successfully authorized MyApplication
          Please enter the following security code to enable full access

          <p align="center"><b>12345</b></p>
        </div>
             * It turns out that the whole html string only contains one 'p align="center"'
             * also it seems that the pin is always 5-digit long,
             * so we will just crudely detect that string and get the pin out.
             * A proper HTML parser should be used in a real application.
             */
            int i=returnedHtml.indexOf("center\"><b>");
            if (i < 0) {
                // marker not found, e.g. the login failed or the page layout changed
                return null;
            }
            String pin = returnedHtml.substring(i+11, i+11+5);
            System.out.println("pin="+pin);
            return pin;
        } catch (IOException ex) {
            ex.printStackTrace();
            return null;
        }
    }
    public static void main(String[] args) throws Exception {
        OAuthConsumer consumer = new DefaultOAuthConsumer(
                "YourConsumerKey",
                "YourConsumerSecret",
                SignatureMethod.HMAC_SHA1);

        OAuthProvider provider = new DefaultOAuthProvider(consumer,
                "https://api.linkedin.com/uas/oauth/requestToken",
                "https://api.linkedin.com/uas/oauth/accessToken",
                "https://api.linkedin.com/uas/oauth/authorize");

        System.out.println("Fetching request token from LinkedIn...");

        // we do not support callbacks, thus pass OOB
        String authUrl = provider.retrieveRequestToken(OAuth.OUT_OF_BAND);

        System.out.println("Request token: " + consumer.getToken());
        System.out.println("Token secret: " + consumer.getTokenSecret());
/*
        System.out.println("Now visit:\n" + authUrl
                + "\n... and grant this app authorization");
        System.out.println("Enter the PIN code and hit ENTER when you're done:");

        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        String pin = br.readLine();
*/
        System.out.println("Now getting PIN from:\n" + authUrl);
      
        String pin = getPin(authUrl);
        System.out.println("Fetching access token from LinkedIn...");

        provider.retrieveAccessToken(pin);

        System.out.println("Access token: " + consumer.getToken());
        System.out.println("Token secret: " + consumer.getTokenSecret());

        URL url = new URL("http://api.linkedin.com/v1/people/~:(id,first-name,last-name,picture-url,headline)");
        HttpURLConnection request = (HttpURLConnection) url.openConnection();

        consumer.sign(request);

        System.out.println("Sending request to LinkedIn...");
        request.connect();
        String responseBody = convertStreamToString(request.getInputStream());

        System.out.println("Response: " + request.getResponseCode() + " "
                + request.getResponseMessage() + "\n\n" + responseBody);
       
    }

    // Stolen liberally from http://www.kodejava.org/examples/266.html
    public static String convertStreamToString(InputStream is) {
        /*
         * To convert the InputStream to a String we use the BufferedReader.readLine()
         * method. We iterate until the BufferedReader returns null, which means
         * there is no more data to read. Each line is appended to a StringBuilder,
         * which is returned as a String.
         */
        BufferedReader reader = new BufferedReader(new InputStreamReader(is));
        StringBuilder sb = new StringBuilder();

        String line = null;
        try {
            while ((line = reader.readLine()) != null) {
                sb.append(line + "\n");
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                is.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        return sb.toString();
    }
}
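
As the comment in getPin notes, the fixed-offset substring is crude. A slightly more tolerant sketch, still not a proper HTML parser, is a regular expression that captures whatever run of digits sits inside the centred bold paragraph. The class and method names below are hypothetical, and the pattern assumes the page still has the markup quoted in that comment:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class PinScraperSketch {

    // Matches <p align="center"><b>DIGITS</b>, allowing whitespace around the digits.
    private static final Pattern PIN_PATTERN =
            Pattern.compile("<p\\s+align=\"center\">\\s*<b>\\s*(\\d+)\\s*</b>");

    // Returns the first PIN found in the HTML, or null if the markup is absent
    // (for example because the login failed or the page layout changed).
    static String extractPin(String html) {
        Matcher matcher = PIN_PATTERN.matcher(html);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String sample = "<div class=\"content\">\n"
                + "  <p align=\"center\"><b>12345</b></p>\n"
                + "</div>";
        System.out.println("pin=" + extractPin(sample));
    }
}
```

Unlike the substring version, this does not depend on the PIN being exactly five digits long, and it returns null rather than an arbitrary slice of the page when the marker is missing.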

4 comments:

Raj said...

Have you tried your code recently? I am getting an authorization error. Basically, I am not able to get the PIN. Any suggestions are appreciated.


Thanks,
Raj

eustachio said...

The parameters on the LinkedIn page have changed; there are a few new ones there. But even including those, I can't get it to authorise. I'm guessing there's something else they are using to try and block people from scraping the PIN from the page. I'm trying to get it working myself; if I succeed, I will post the results.

Term Papers said...

I have been visiting various blogs for my term papers writing research. I have found your blog to be quite useful. Keep updating your blog with valuable information... Regards

Matthew said...

Your code provided the exact lines I needed to get my LinkedIn API integration up and running...I wish I had found this page weeks ago. Thanks!!!