Monday, May 15, 2023

How to use ChatGPT realistically to do daily tasks faster.

This blog post is entirely written by a meat bag (that is Bender talk for 'human'). 👋

The situation.

My morning routine sometimes involves scrolling through Hacker News, and this morning I ran into someone's side project called VORE: an RSS reader that is not an app, but a webpage.

The neat thing about having an RSS reader as a webpage is that you can access it anywhere.

  • There is no need for an app on your phone or PC. You simply visit the URL, and there is your curated list of stuff you want to read that day.
  • Another neat feature of simply having a URL as your personal RSS feed is that it's now trivially easy to share this URL with a friend and thereby give them access to your entire catalog.
  • The way VORE works, the URL is simply your username. In my case I picked the meaningless username HJ56a, so my personal RSS feed can now be viewed by the whole world at https://vore.website/HJ56a

[Insert This Is The Way meme here]

The problem.

To import the list of RSS feeds I am currently subscribed to, I first had to export them from NetNewsWire (which I highly recommend if you are on macOS or iOS).

NetNewsWire exports to an OPML file. Vore.website doesn't import OPML files. I was stuck.

ChatGPT to the rescue?

So, I needed to convert an OPML file to a list of bare RSS feed URLs (that is, URLs that lead straight to an .rss, .xml, or .atom page).

I remembered that I forked over $25 to OpenAI this month to try out ChatGPT4. So, I logged in and asked this:

I have an OPML file here. Do you know what that is?

The response:

Yes, I do. OPML (Outline Processor Markup Language) is an XML format for outlines (defined as "trees, where the nodes contain strings"). It was originally developed by UserLand Software for use with their Radio UserLand product. OPML files are used to exchange information between different types of applications such as RSS readers, mind mapping tools, and note-taking apps.

In the context of RSS feed readers, OPML files are commonly used to import and export subscriptions. This allows users to easily migrate their feeds between different platforms. Each entry typically includes the title of the feed, its URL, and sometimes additional metadata.

Here's a simple example of what an OPML file might look like:

<?xml version="1.0" encoding="UTF-8"?>
<opml version="1.0">
    <head>
        <title>My Subscriptions</title>
    </head>
    <body>
        <outline text="Example Feed" title="Example Feed" type="rss" xmlUrl="http://www.example.com/rss"/>
    </body>
</opml>

In this example, there's a single RSS feed titled "Example Feed" located at "http://www.example.com/rss".

Okay, so ChatGPT at least knows what an OPML file is. That is good to know.

Reader, be warned: here be dragons 🐉

The difficulty with ChatGPT is that you have to ✨know what to ask for✨ before you ask it. It's true that ChatGPT4 is now better than previous versions at coming up with stuff you didn't specify, but it's still very useful to be able to steer the conversation, as this example will highlight.

You see, ChatGPT is not really intelligent at all; it is simply a slightly more dandy-sounding Google or DuckDuckGo. It doesn't have true creativity, but it is an index, if you will, of previous human ideas. As Ted Chiang (yes, the gentleman of 'Story of Your Life'/'Arrival' fame) put it recently, AI presently gives us a low-resolution picture, "a blurry JPEG of the web".

This, incidentally, is also a side plot in Guardians of the Galaxy Vol. 3, but I digress.

Okay, so I know deep in my bones that there is a plethora of command line tools that were perfected somewhere in the 70s or 80s and have since gone the way of all things beautiful and useful: they have been forgotten. You see, every generation thinks it knows best and therefore should reinvent stuff that has already reached peak performance. Because I, however, read the classics (the greybeards' comments on Hacker News), I happen to know of command line tools like AWK, SED, GREP, and CURL, to name a few. I just don't have a clue how to use them.

This leads us to my next question to ChatGPT.

O wizard, conjure me a script? 🧙‍♂️

My next question to ChatGPT was:

Can you write a simple script for me, maybe in BASH, that takes this one OPML file as an input and strips all data except for the htmlUrl part, so that afterward I just have a list of URLs left?

What I expected was some AWK, SED, GREP magic. What I got was this:

Sure, here is a simple BASH script using xmlstarlet, a command line XML toolkit, to parse the OPML file and extract htmlUrl attributes.

#!/bin/bash

# Check if file is provided as argument
if [ $# -eq 0 ]
then
    echo "No arguments supplied. Please provide the OPML file as an argument."
    exit 1
fi

# Check if xmlstarlet is installed
if ! [ -x "$(command -v xmlstarlet)" ]; then
  echo 'Error: xmlstarlet is not installed.' >&2
  exit 1
fi

# Process the OPML file to extract htmlUrl
xmlstarlet sel -t -v "//outline/@htmlUrl" -n $1

Save this script in a file, make it executable using chmod +x scriptname.sh and then you can run it with the OPML file as an argument like ./scriptname.sh file.opml.

This script checks if the OPML file is provided as an argument and if xmlstarlet is installed in the system. If both conditions are met, it will process the OPML file and print all htmlUrl values, one per line.

Make sure to install xmlstarlet if it's not already installed in your system. You can install it using package managers like apt for Ubuntu (with sudo apt install xmlstarlet) or brew for macOS (with brew install xmlstarlet).

I had never heard of this little, free tool named xmlstarlet before, but it didn't sound like something ChatGPT hallucinated, so I ran brew install xmlstarlet in my Mac terminal.

Next, I copied that little script into a file and ran the chmod command. Again, I never knew you needed to give a script the +x flag (x for executable, maybe?).

Next, I ran the command in my terminal, and lo and behold, it actually worked!
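For the record, the whole sequence, using the placeholder names from ChatGPT's own instructions, boiled down to this:

# install xmlstarlet via Homebrew (macOS)
brew install xmlstarlet

# make the script executable ("+x" adds the execute permission)
chmod +x scriptname.sh

# run it with the exported OPML file as the argument
./scriptname.sh file.opml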

So, I copied this list of URLs into VORE, only to be disappointed: the URLs were all wrong 🥲

We don't need htmlUrl

This is where a non-technical person might have given up. But not you and me, no!

I assumed VORE isn't intelligent enough to go and visit every URL I pasted into the box and fetch the exact RSS feed URL from each page. So, I opened the OPML file to find out what it actually looks like.
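A representative entry in such an export (with placeholder URLs, not my actual feeds) looks something like this:

<outline text="Some Blog" title="Some Blog" type="rss" htmlUrl="https://example.com/" xmlUrl="https://example.com/feed.xml"/>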

As we can see, each outline element carries two kinds of URL attributes: htmlUrl, which points to the regular webpage, and xmlUrl, which points to the actual feed. And I had extracted precisely the wrong one.

To see if ChatGPT could help me select the right attribute, I then asked this:

Thanks a lottle. Can you change the script a little bit, so that it doesn't list the htmlURL elements, but rather the xmlUrl elements?

The resulting response made perfect sense. It told me to change the last line of the bash script to:

xmlstarlet sel -t -v "//outline/@xmlUrl" -n $1 

After pasting this new list into VORE, I found yet another problem: there were a bunch of Twitter URLs in the list. Those made sense in NetNewsWire because, until recently, a plug-in could turn a Twitter feed into content for NetNewsWire; now that most of the Twitter APIs are closed off, this no longer works. I had just never cleaned up my feed list in NetNewsWire.

I wasn't about to do so now, because deleting a source also deletes all bookmarks from that source in the app, and I don't want to lose those just yet. I could go the tedious route where I 1) get the Nitter.net URLs for these Twitter URLs, 2) find the old posts that I bookmarked, and 3) bookmark them again.

But. I'm counting on Nitter being blocked some time soon as well, so why waste the energy?

Anyway, I now needed to clean up this list and remove all Twitter-related URLs, so I turned to ChatGPT one last time:

can you add one more feature to this script? If the resulting output is simple a domain like https://twitter.com/, then I don't want to see it. So I want to filter them out

The output this time did reference a tool I actually know of. Remember how, in the beginning, I bragged about having heard of GREP? Well, here it turns up:

Certainly. We can use a simple filtering mechanism with grep to filter out the lines that end with a domain. Here's the updated script:

#!/bin/bash

# Check if file is provided as an argument
if [ $# -eq 0 ]
then
    echo "No arguments supplied. Please provide the OPML file as an argument."
    exit 1
fi

# Check if xmlstarlet is installed
if ! [ -x "$(command -v xmlstarlet)" ]; then
  echo 'Error: xmlstarlet is not installed.' >&2
  exit 1
fi

# Process the OPML file to extract xmlUrl and filter out base domains
xmlstarlet sel -t -v "//outline/@xmlUrl" -n $1 | grep -vE "^(http|https)://[^/]+/?$"

The grep -vE "^(http|https)://[^/]+/?$" command is used to filter out any lines that match the regular expression pattern, which in this case is a pattern matching a simple domain like https://twitter.com/ or http://example.com. The -v option in grep is for inverting the match, meaning it will exclude lines that match the pattern rather than including them. 
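To convince myself of what that filter actually does, here is a quick check you can run in a terminal (the example feed URL is made up):

printf 'https://twitter.com/\nhttps://example.com/feed.xml\n' | grep -vE "^(http|https)://[^/]+/?$"
# only https://example.com/feed.xml survives; the bare domain is dropped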

So, the last line of the final version of my script now reads:

xmlstarlet sel -t -v "//outline/@xmlUrl" -n $1 | grep -vE "https://twitter.com"

The resulting list in my terminal window was everything that I wished for. After pasting this list into VORE, I am now the proud owner of a functional RSS feed on the web. 😄


If you want to play around with this OPML to RSS feed URLs script, here it is:
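(Reassembled from the snippets above; scriptname.sh and file.opml are just the placeholder names from earlier, and "$1" is quoted here so file names with spaces also work.)

#!/bin/bash

# Check if an OPML file is provided as an argument
if [ $# -eq 0 ]
then
    echo "No arguments supplied. Please provide the OPML file as an argument."
    exit 1
fi

# Check if xmlstarlet is installed
if ! [ -x "$(command -v xmlstarlet)" ]; then
  echo 'Error: xmlstarlet is not installed.' >&2
  exit 1
fi

# Extract the xmlUrl attribute of every outline element and drop the Twitter feeds
xmlstarlet sel -t -v "//outline/@xmlUrl" -n "$1" | grep -vE "https://twitter.com"

Run it as ./scriptname.sh file.opml and paste the resulting list into VORE.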


You might have noticed I stuffed some Amazon Affiliate links in this post? That's on purpose. I hope to monetize this blog. I hope you don't mind me earning an honest living?

To discuss this article, you can comment at HN: https://news.ycombinator.com/item?id=35949269
