
I was out on a walk with a friend of mine, complaining about how you can't identify when someone unfollows you on Instagram.
My friend turned to me and explained that there's an app she uses that does exactly that. I took a look and it was exactly as she said: the app presented her with a distinct list of people not following her back. So, I downloaded and trialled a few of these apps - the whole time completely against the idea of handing over my credentials to a third-party application. To my frustration, I found that the people the apps claimed weren't following me back actually were. I scoffed: "I could make something better than this!"; and, always seeking an opportunity to learn, I updated my password and got to it.
Proof of Concept
My thoughts began with network traffic. I knew this wasn't going to be anything near my final iteration, so I took the liberty of making functional shortcuts like this for the sake of proving it could be done. I had a hunch that I could find account data in an API (application programming interface) response just by using the site. I started by navigating to Instagram and monitoring network traffic in the browser's developer tools as I scrolled through my followers - and my hunch proved correct. The response was a big JSON (JavaScript Object Notation) payload with a bunch of rubbish we didn't need. So, I went to an online JSON compiler, threw the response in there, chopped together a quick JQL (JSON query language) query to filter down to the usernames, and I was left with a small cluster of usernames - which I will obfuscate for obvious reasons in the following image; but the same query would work for anyone. This makes sense, because Instagram wouldn't want to render all of your followers in one go when you only ever see 7-12 at a time in the little modal. It would make being a celebrity so hard :(
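To give a concrete idea of what that filtering does, here's a minimal C# sketch of the same idea. The payload shape (a `users` array with a `username` on each entry) is a simplification of the real response, so treat it as an assumption:

```csharp
using System;
using System.Linq;
using System.Text.Json;

class PayloadFilter
{
    static void Main()
    {
        // A heavily trimmed, made-up version of one followers payload.
        const string payload = @"{ ""users"": [
            { ""pk"": 1, ""username"": ""first_user"" },
            { ""pk"": 2, ""username"": ""second_user"" }
        ] }";

        using var doc = JsonDocument.Parse(payload);

        // The equivalent of the JQL query: walk the users array and keep only the usernames.
        var usernames = doc.RootElement
            .GetProperty("users")
            .EnumerateArray()
            .Select(user => user.GetProperty("username").GetString())
            .ToList();

        usernames.ForEach(Console.WriteLine);
    }
}
```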

So I proceeded to go through every single network call, copy-paste its payload into the query compiler, and edit the brackets out in a text editor, and ka-boom, I had my two lists of data. You can imagine how draining this gets when you have to process n / 12 network calls, where n is the number of followers and followings you have; this was another opportunity for improvement that I took note of. From here I quickly prototyped a comparer in an online .NET compiler and proved that we can find a delta between the two lists.
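The comparer itself only needs a few lines. Something along these lines (the names are mine, not necessarily what the original prototype used):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Comparer
{
    static void Main()
    {
        // The two lists pasted out of the query compiler.
        var followers = new List<string> { "alice", "bob", "carol" };
        var following = new List<string> { "alice", "bob", "dave", "erin" };

        // The delta: people I follow who don't follow me back.
        var notFollowingBack = following.Except(followers).ToList();

        Console.WriteLine(string.Join(Environment.NewLine, notFollowingBack));
    }
}
```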
Okay cool, we have a working prototype but functionally, it's a joke. We need another way to grab data that doesn't require as many manual steps to obtain and format. This is also a good opportunity to standardise and confine some of our processes away from random websites.
Mark II
If the data isn't going to be from the network traffic, then where will we get it? Well, we can see it on the page, so it must also be in the DOM (document object model). So how do we crawl (scrape data from) a DOM?
Well, websites are a combination of HTML (hypertext markup language - how the page generally looks and is structured), CSS (cascading style sheets - how the page is styled at a more meticulous and vibrant level), and JS (JavaScript - the functional glue between the look and the operation of the site). With this in mind, I made my next approach a JavaScript one. I played around for a while, but ended up defining some JavaScript functions that selected and scrolled through elements to construct the entire list in the DOM (in contrast to clusters), then scraped the new elements at their nested levels to return the full list of names. I then injected these functions into my browser's console, which meant I could call them from the console and they would operate on the current page. This reduced the mundane process of manually fetching the data and copying it across to a one-time copy-paste and a function call.

The function utilises recursion so that the asynchronous sleep calls behave synchronously. As you can see, the script is simple: it finds the last element in the list and scrolls it into view; if the number of elements in the collection is different after the scroll, it scrolls again.
From here I had a basic loop that dug into the child elements and returned the usernames as a single list. I then pasted those names into files that would be digested by a .NET project.

Now we've migrated the logic out of a third-party website and into a locally controlled environment. We integrated file reading so that we, as users of the software, are abstracted away from touching the source code. We've also introduced a new file called exclusions, which contains all the people we don't expect to follow us back, so that we're not congesting the delta with legends like Jude Law and Abigail Larson.
We also added very basic error handling that at least writes to a separate error log file if something ever goes wrong, rather than failing silently.
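Pieced together, that stage looks roughly like the sketch below. The file names and structure are my own stand-ins, but the flow is the same: read the pasted lists, apply the exclusions, compute the delta, and log any failure to a file.

```csharp
using System;
using System.IO;
using System.Linq;

class UnfollowerCheck
{
    static void Main()
    {
        try
        {
            // The lists pasted out of the browser, one username per line.
            var followers = File.ReadAllLines("followers.txt").ToHashSet();
            var following = File.ReadAllLines("following.txt");

            // People we never expect to follow back (celebrities, artists, etc.).
            var exclusions = File.ReadAllLines("exclusions.txt").ToHashSet();

            // The delta: accounts I follow that don't follow me back and aren't excluded.
            var heathens = following
                .Where(name => !followers.Contains(name) && !exclusions.Contains(name));

            File.WriteAllLines("output.txt", heathens);
        }
        catch (Exception ex)
        {
            // Very basic error handling: anything unexpected goes to a separate log file.
            File.AppendAllText("error.log", $"{DateTime.Now}: {ex}{Environment.NewLine}");
        }
    }
}
```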
I saved the project and lazily call it through the CLI (command line interface) with the built-in dotnet commands (dotnet build, then dotnet run) to build and execute the code (for honestly no reason - I've just enjoyed dicking around in the CLI as of late). Now the process is: embed the crawling code in the browser and call its functions, paste the results into our files, then run the CLI commands. We now have more secure code with increased functionality, and it demands less human effort.
Nice.
But we can do better. I believe this project has the potential to be entirely autonomous outside of us kicking off the process. So I moved to the Selenium framework.

Mark III
Selenium is an automated testing framework, generally used by those who wish to simulate website usage to validate a myriad of things. I've implemented and worked with Selenium at two workplaces at the time of writing this article, so it's definitely a no-brainer for the next iteration of our prototype. The first step was to work on a little bit of infrastructure before getting into code. I started by installing the Selenium NuGet package.

From here we need a WebDriver for Selenium to use. The one people generally use with Selenium is ChromeDriver, but you can use whichever browser you like. I placed the driver in our project directory and ran into some version issues, which I ended up fixing by just downloading an older version of the driver.
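Getting a driver going is only a few lines once the NuGet package and the ChromeDriver binary agree with the installed browser's version - roughly this (the path and options are my guesses at the setup, not lifted from the project):

```csharp
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class Program
{
    static void Main()
    {
        var options = new ChromeOptions();
        options.AddArgument("--start-maximized");   // full-size window so the follower modal renders normally

        // The ChromeDriver binary lives alongside the project output.
        using IWebDriver driver = new ChromeDriver(".", options);
        driver.Navigate().GoToUrl("https://www.instagram.com/");
    }
}
```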
The second hurdle was logging in. Unlike our previous prototypes, I wasn't already logged in, so this required a bit more caution - my mind immediately went to Instagram noticing the automation, which we will get to later. I experimented with ways of logging in while refraining from manually typing in credentials as much as possible - this was just a preference. The experiments ranged from embedding credentials in the URL (which works great with Windows authentication if you ever need to bypass that) to trying to connect a Chrome profile, but I ended up leveraging browser cookies instead.
For anyone not super tech savvy, a cookie is stored metadata for websites. Sometimes they store credentials and other knick-knacks, but social networks often store sessions against some sort of hash so that you're not signed out every time you close a tab. Instagram session cookies also expire after about a year, so that's good.
After some playing, I came up with a fun system for logging in. I would navigate to Instagram so that I could manage its cookies, add my current session cookie to the browser, and then reload the page. Generally this signed me in (and is another really good example of why you should lock your computer when you're not at it), but I wanted to cater for any unexpected scenario of the cookie changing or expiring.

I decided the system would be this: we attempt to inject the cookie into the browser to sign in, and if after a moment there is no clear indication that we successfully signed in, it's safe to assume the cookie failed. We then navigate to the login page, paste (reluctantly) our credentials into the appropriate fields, and log in. Whenever this happens, I capture the page's new session cookie and save it to our credentials file so that next time we can resume the cookie-based login. This is probably a bit of overkill since I'll only be running this code monthly or bi-monthly, but it was still a fun learning exercise. Fun fact: while testing this project to completion, I was kicked out of my account three times and had to change my password and authenticate my identity :S
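In Selenium terms, the happy path of that system looks something like this - the cookie name and expiry are assumptions on my part, but the cookie-handling calls are the real Selenium ones:

```csharp
using System;
using OpenQA.Selenium;

static class Login
{
    public static void SignInWithCookie(IWebDriver driver, string sessionValue)
    {
        // Cookies can only be set for the domain we're currently on.
        driver.Navigate().GoToUrl("https://www.instagram.com/");

        // "sessionid" is what Instagram's session cookie appears to be called at the time of writing.
        var session = new Cookie(
            "sessionid", sessionValue, ".instagram.com", "/", DateTime.Now.AddDays(30));
        driver.Manage().Cookies.AddCookie(session);

        // Reload so the site picks the session up; if we still see the login
        // form afterwards, the cookie has failed and we fall back to credentials.
        driver.Navigate().Refresh();
    }
}
```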

Straight off the bat, we have a more sophisticated and clever sign-in mechanic, and it works. Notice I'm using something called a 'fluent wait' when I select an element? This is another perk of using the Selenium framework in our .NET project. Fluent waits are extremely useful because they continually try to perform a function until it works. This removes the need to wait a fixed period of time for an element to load after whatever network-related hiccup; although, after an unreasonable amount of time we do give up and abandon the attempt under the assumption that something has gone wrong.
As you can see in the screenshot below, I've customised the fluent wait to expire after a default of 40 seconds (which I often override in the code to 3 seconds for basic page searching). The fluent wait is clever in that you can set how often it performs its function (polling), and you can even set it to ignore certain exceptions, which I had to do to keep the program running.
We expect it not to find a file the first time round, which is why I ignore that - any other issue will halt the program.
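For reference, a fluent wait in the .NET bindings is a `DefaultWait<IWebDriver>` from the Selenium support package, configured roughly like this (the polling interval and the exact ignored exception types are my guesses):

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;

static class Waits
{
    public static IWebElement WaitForElement(IWebDriver driver, By locator, int seconds = 40)
    {
        var wait = new DefaultWait<IWebDriver>(driver)
        {
            Timeout = TimeSpan.FromSeconds(seconds),        // 40s by default, 3s for basic page searching
            PollingInterval = TimeSpan.FromMilliseconds(500)
        };

        // Keep polling through "not there yet" errors; anything else bubbles up and halts the run.
        wait.IgnoreExceptionTypes(typeof(NoSuchElementException));

        return wait.Until(d => d.FindElement(locator));
    }
}
```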

After some fiddling, I got our crawling function to work. It cleverly figures out how many followers Instagram thinks I have and continues to crawl until it finds that many. Selenium doesn't have a built-in scrolling function, so I embedded one into the browser with JavaScript. I found out the hard way that, due to a bug in Instagram, the follower/following count may not match what is actually there - which, after some research, I've concluded might be related to people deleting their accounts; I dunno for sure. To cover this case, I implemented a 6-second timeout. The timeout's expiration is pushed out every time the list grows after a scroll; when nothing changes for 6 seconds, we exit and continue.
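Put together, the crawl loop amounts to something like the sketch below. The locator and helper names are mine, and for brevity it leans on the sliding timeout alone rather than also checking the expected count, but it shows the JavaScript scroll injection and the 6-second idle window described above:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using OpenQA.Selenium;

static class Crawler
{
    // Scrolls the followers/following modal until the list stops growing for 6 seconds.
    public static List<string> CrawlList(IWebDriver driver, By rowLocator)
    {
        var js = (IJavaScriptExecutor)driver;
        var idleTimer = Stopwatch.StartNew();
        int lastCount = 0;

        while (idleTimer.Elapsed < TimeSpan.FromSeconds(6))
        {
            var rows = driver.FindElements(rowLocator);

            // Selenium has no built-in scroll, so inject a small piece of JavaScript instead.
            if (rows.Count > 0)
                js.ExecuteScript("arguments[0].scrollIntoView(true);", rows[rows.Count - 1]);

            if (rows.Count != lastCount)
            {
                lastCount = rows.Count;
                idleTimer.Restart();   // the list grew, so push the expiration out again
            }

            Thread.Sleep(500);         // give Instagram a moment to load the next cluster
        }

        return driver.FindElements(rowLocator)
                     .Select(row => row.Text)
                     .ToList();
    }
}
```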

After this, we perform our usual delta against the exclusions and write the output to a file in a directory I've aptly titled "Heathens". I then set properties on files like credentials and exclusions so that they aren't baked into the published output, which means we can still modify them without having to re-publish the code. I also had to update the runtime path in the back end, because once we publish the code it isn't run from the same directory as it is in Visual Studio.
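The path fix is worth a quick sketch because it trips people up: once published, the app no longer runs from the Visual Studio project folder, so I anchor output paths to the executable's own directory instead (everything here other than the "Heathens" name is an assumption):

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class Output
{
    public static void WriteHeathens(IEnumerable<string> heathens)
    {
        // Anchor everything to where the published executable actually lives,
        // not whatever directory the app happened to be launched from.
        string heathensDir = Path.Combine(AppContext.BaseDirectory, "Heathens");
        Directory.CreateDirectory(heathensDir);   // no-op if it already exists

        string outputFile = Path.Combine(heathensDir, $"heathens-{DateTime.Now:yyyy-MM-dd}.txt");
        File.WriteAllLines(outputFile, heathens);
    }
}
```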
And that's it!
We now have an application we simply click, and it performs all of the hard work, leaving us the results to analyse. We have watched our concept go from a cumbersome, mostly-manual process to a sophisticated, efficient, intelligent, and simple product for clearing out social cobwebs - because I'm just that guy!
