
Robot in front of a screen with a paper box full of hash symbols in the middle

Streamline adding hashtags to Instagram posts with ChatGPT


As a hobby photographer, I like to showcase my photography on Instagram. But I hate coming up with hashtags for my posts. So as a software engineer and LLM enthusiast, I thought: can I automate and streamline the process of coming up with hashtags for my Instagram posts with ChatGPT?

Hashtags are a great way to attract relevant audiences. Unfortunately, my photography style is quite broad and there isn't one set of hashtags that would fit most of my photos, so I often have to come up with a unique set for every photo. One could keep a list of frequently used tags, but even then, picking an appropriate combination is a non-trivial task.

ChatGPT to the rescue

A ChatGPT Plus or higher subscription, starting from $20 a month at the time of writing, gives you access to three crucial functions:

  • Code interpreter – to extract the camera model and location information from the photo's EXIF and XMP data
  • Image analysis – to describe in words what is visible in the image
  • GPT-4 model – to combine the camera model, location and description and select appropriate hashtags

Getting the right ChatGPT prompt

Initially, I wasn't sure how to ask ChatGPT to analyse the image so that the output would be useful for the later tasks. So I used ChatGPT to help me construct the prompt:

Generate a prompt for GPT-4 that given an image will:

1. Describe the image
2. Generate 30 Instagram hashtags based on the description.

The suggested prompt was a good start:

Given an image, perform the following tasks:

1. Describe the image in detail. Focus on identifying the main subjects, objects, setting, and any notable features or actions taking place within the image. Include colors, textures, emotions, and any other relevant details that capture the essence of the image.

2. Generate 30 Instagram hashtags based on your description. Create hashtags that are relevant to the main subjects, objects, and themes identified in your description. Ensure these hashtags are varied and cover different aspects of the image, such as location, mood, specific objects or subjects, events, or any unique details. The goal is to make these hashtags useful for categorizing the image on Instagram, helping it reach a relevant audience.

I ended up expanding it a little bit. For Czech content it generated hashtags with non-Latin characters, which are allowed but rarely used. Some of the tags were also excessively long, which makes them less likely to be searched for.

Prefer shorter hashtags as they are more likely to be typed by someone in search. Hashtags consist of only lower case latin letters.
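These constraints can also be enforced locally as a post-processing step. A minimal sketch (the function name and length cutoff are my own, not part of ChatGPT's output) that keeps only lowercase hashtags and orders them shortest first:

```python
import re

def filter_hashtags(tags, max_len=20):
    """Keep hashtags made of lowercase Latin letters (plus the digits and
    underscores Instagram also allows), shortest first.
    max_len is an arbitrary cutoff for excessively long tags."""
    valid = [t for t in tags if re.fullmatch(r'#[a-z][a-z0-9_]*', t)]
    return sorted((t for t in valid if len(t) <= max_len), key=len)

print(filter_hashtags(['#Praha', '#cityscape', '#minimal',
                       '#anextremelylonghashtagnobodytypes']))
# ['#minimal', '#cityscape']
```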

Giving ChatGPT additional context

I use Adobe Lightroom for processing my photos. It is very easy to add additional information that is embedded into the photo file and can be accessed by ChatGPT later. I was particularly interested in two pieces of information:

  • Where the photo was captured (to be able to generate location-specific hashtags)
  • The camera model (to be able to get generic hashtags for the camera make)

The code interpreter in ChatGPT can run arbitrary Python scripts. First, I wanted to read the location information. That is available as XMP, which is just XML embedded into the JPEG file, so it is very easy to read with a regular expression using the re module. The location is separated into Country, City and Location fields. The following script reads them from a filename received as the first command line argument:

import re
import sys
with open(sys.argv[1], 'rb') as f:
    data = f.read()
print(
    "\n".join(
        [
            _.decode().replace('"', '').replace("=", ": ")
            for _ in re.findall(b'(?:City|Country|Location)="[^"]*"', data, re.DOTALL)
        ]
    )
)

I saved this script as read_location.py to be used later.
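To illustrate what the script matches, here is the same regular expression run on a hypothetical XMP fragment of the kind Lightroom embeds (the field values are made up; real files wrap this in a much larger XML packet):

```python
import re

# Hypothetical XMP snippet with the three fields Lightroom writes
sample = b'<rdf:Description photoshop:City="Prague" photoshop:Country="Czechia" Iptc4xmpCore:Location="Charles Bridge"/>'

fields = [
    m.decode().replace('"', '').replace('=', ': ')
    for m in re.findall(b'(?:City|Country|Location)="[^"]*"', sample, re.DOTALL)
]
print('\n'.join(fields))
# City: Prague
# Country: Czechia
# Location: Charles Bridge
```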

Next, I wanted to get the camera model. That information is available as EXIF tag number 272. Unfortunately, EXIF is a binary format and not as straightforward to read, so I asked ChatGPT again to generate code to read the EXIF tag:

import sys
def read_exif_tag(jpeg_path, tag_decimal):
    def bytes_to_int(bytes_val, little_endian=True):
        return int.from_bytes(bytes_val, 'little' if little_endian else 'big')
    with open(jpeg_path, 'rb') as f:
        data = f.read()
    app1_start = data.find(b'\xFF\xE1')
    if app1_start == -1:
        return "APP1 segment not found, maybe no EXIF data."
    exif_header = b'Exif\x00\x00'
    if data[app1_start+4:app1_start+10] != exif_header:
        return "Not an EXIF segment."
    byte_order = data[app1_start+10:app1_start+12]
    little_endian = byte_order == b'II'
    # TIFF header starts right after 'Exif\0\0', at the byte order mark
    tiff_header_start = app1_start + 10
    ifd_offset = bytes_to_int(data[tiff_header_start+4:tiff_header_start+8], little_endian) + tiff_header_start
    num_ifd_entries = bytes_to_int(data[ifd_offset:ifd_offset+2], little_endian)
    for i in range(num_ifd_entries):
        entry_start = ifd_offset + 2 + (i * 12)
        tag = bytes_to_int(data[entry_start:entry_start+2], little_endian)
        if tag == tag_decimal:
            # Type of the data stored in this tag
            data_type = bytes_to_int(data[entry_start+2:entry_start+4], little_endian)
            # Number of components of the given type
            num_of_components = bytes_to_int(data[entry_start+4:entry_start+8], little_endian)
            # Calculate the data length based on type; assuming type 2 (ASCII string)
            # For simplicity; different types have different sizes
            data_length = num_of_components
            # Offset or value directly
            value_offset = bytes_to_int(data[entry_start+8:entry_start+12], little_endian)
            # Check if value is offset or directly in the 4 bytes
            if data_length > 4:
                # Adjust offset relative to TIFF header start, not EXIF header start
                absolute_offset = tiff_header_start + value_offset
                model_name = data[absolute_offset:absolute_offset+data_length].rstrip(b'\x00').decode()
            else:
                model_name = data[entry_start+8:entry_start+8+data_length].rstrip(b'\x00').decode()
            return model_name
    return f"EXIF tag {tag_decimal} not found."
print(read_exif_tag(sys.argv[2], int(sys.argv[1])))

I saved this script as read_exif_tag.py to be used later.

Reposting accounts

There are also photo aggregating accounts that follow a specific hashtag. These hashtags are good to add if the photo matches the topic, because if such an account reposts your photo, you can gain significant exposure. To make sure these hashtags are always included, I added the following instructions:

Make sure you include the following hashtags for these photos:
- predominantly yellow: #ayellowmark
- minimalistic: #soulminimalist, #minimalint, #ig_minimalshots
- architecture: #lookingup_architecture, #creative_architecture, #tv_pointofview
- czech landscape: #ceskakrajina

Commonly used hashtags

I was not completely satisfied with the hashtags generated based on the previous prompts. They looked sensible but were still random and rarely had any photos posted under them. So I thought it would be better to have a pre-selected list of many existing hashtags for the AI model to pick from instead.

I went through about 60 accounts I follow for inspiration, opened their posts and copied the hashtags they used into a text file. I didn't really care about formatting or duplicates at this point. When I felt I had a reasonable amount covering a wide range of topics, I replaced all spaces with newline characters, sorted the list and removed duplicates. I was left with 456 hashtags in a text file, one per line. I saved this file as hashtags.txt to be used later.
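The clean-up step itself is easy to script; a sketch of the same split, sort and deduplicate pass (the sample tags are hypothetical):

```python
def dedupe_hashtags(text: str) -> list[str]:
    """Split copied hashtags on any whitespace, drop duplicates, sort."""
    return sorted(set(text.split()))

raw = '#praha #minimal #praha\n#cityscape #minimal'  # hypothetical clipboard dump
print('\n'.join(dedupe_hashtags(raw)))
# #cityscape
# #minimal
# #praha
```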

Custom GPT

To put all the prompts, scripts and common hashtags together in a reusable and easy-to-use way, I decided to utilise a feature of the ChatGPT Plus subscription called GPTs. Rather than having to type the instructions every time, I just want to go to the custom GPT in the left-hand panel (or even reference it in a chat with @gpt_name) and upload a photo. It should then perform all the stored instructions and leave me with an appropriate list of hashtags.

To achieve that, you first need to create a new GPT. You can do that by clicking your name in the bottom left corner and going to My GPTs / Create a GPT / Configure tab. In there you give it a name, a short description and, most importantly, the instructions on what to do.

I put together the instructions from all the previous sections into a single step-by-step list:

Make sure you include the following hashtags for these photos:

- predominantly yellow: #ayellowmark
- minimalistic: #soulminimalist, #minimalint, #ig_minimalshots
- architecture: #lookingup_architecture, #creative_architecture, #tv_pointofview
- czech landscape: #ceskakrajina

---

Given an image, perform the following tasks:

1. Using the code interpreter, run `!python /mnt/data/read_location.py {image_file_path}` to extract location from the image file.

2. If the location is empty, ask where the photo was taken. Wait for user input before continuing to the next task.

3. Using the code interpreter, run `!python /mnt/data/read_exif_tag.py 272 {image_file_path}` to extract the camera model from the image file.

4. If the camera model could not be extracted, ask about the camera model used to take the photo. Wait for user input before continuing to the next task.

5. Describe the image in detail. Focus on identifying the main subjects, objects, setting, and any notable features or actions taking place within the image. Include colors, textures, emotions, and any other relevant details that capture the essence of the image.

6. Generate 30 Instagram hashtags based on your description, the location, the camera model and the suggested hashtags from the "hashtags.txt" file. Create hashtags that are relevant to the main subjects, objects, and themes identified in your description. Ensure these hashtags are varied and cover different aspects of the image, such as location, mood, specific objects or subjects, events, or any unique details. The goal is to make these hashtags useful for categorizing the image on Instagram, helping it reach a relevant audience. Prefer shorter hashtags as they are more likely to be typed by someone in search. Hashtags consist of only lower case latin letters.

As you can see, I added a workaround for cases when photos do not contain the location or camera model. If that happens, this GPT will ask you to provide them in the chat.

Next, you need to upload all the supporting files – the Python scripts and the text file with common hashtags – into the Knowledge section:

  • read_location.py
  • read_exif_tag.py
  • hashtags.txt

In Capabilities, I enabled only the Code Interpreter.

And that's it. I can try it out in the Preview panel on the right side to see how well it works and whether I need to make any adjustments to the instructions, the scripts or the hashtag data.

Publishing

I decided not to publish this GPT. While the approach is reusable, the hashtag database is tailored to my needs, and having to provide your own database on every query would defeat the purpose. What could eventually be published is the GPT without the database, which could then be used by personal GPTs providing the missing hashtag database.

Pricing

If you don't want to pay a $20 monthly subscription for ChatGPT Plus, you can also get access to the vision and GPT-4 models via the OpenAI API:

  • gpt-4-1106-vision-preview
  • gpt-4

I opted for GPT-4 as opposed to the cheaper GPT-3.5 (or non-OpenAI alternatives, such as OctoAI Llama 2) as the performance was noticeably better.
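Going the API route means assembling the vision request yourself. Below is a sketch of the chat-completions payload with the image sent inline as a base64 data URL – no network call is made here, and the prompt text would be the one from earlier in this article:

```python
import base64

def build_vision_request(image_bytes: bytes, prompt: str) -> dict:
    """Build a chat-completions payload that sends an image inline as a data URL."""
    b64 = base64.b64encode(image_bytes).decode()
    return {
        'model': 'gpt-4-1106-vision-preview',
        'messages': [{
            'role': 'user',
            'content': [
                {'type': 'text', 'text': prompt},
                {'type': 'image_url',
                 'image_url': {'url': f'data:image/jpeg;base64,{b64}'}},
            ],
        }],
    }

request = build_vision_request(b'\xff\xd8\xff', 'Describe the image in detail.')
```

This dictionary is what you would pass to the chat completions endpoint with your API key.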

Let’s use the following image as an example:

Here is a breakdown of the cost of using the OpenAI API, using the official OpenAI tokenizer to estimate the number of tokens:

Task                           Tokens   Price per 1K tokens   Cost
Vision model input
  1440x1920px image               765   $0.01                 $0.00765
  Text instructions                49   $0.01                 $0.00049
Vision model output
  Image description               241   $0.01                 $0.00241
GPT-4 input
  Image analysis context          241   $0.03                 $0.00723
  Location context                 15   $0.03                 $0.00045
  Camera model context              7   $0.03                 $0.00021
  Reposting accounts               71   $0.03                 $0.00213
  List of 456 common hashtags    2258   $0.03                 $0.06774
  Text instructions               127   $0.03                 $0.00381
GPT-4 output
  Hashtag suggestions             192   $0.06                 $0.01152
Total                                                         $0.10364

If we divide the monthly ChatGPT Plus subscription cost of $20 by the price of analysing a single image via the API, we could analyse about 192 images for the same money!
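The totals are easy to double-check by re-deriving them from the token counts and per-1K prices in the table:

```python
# (tokens, price per 1K tokens) for each item in the cost table
items = {
    'vision input: image':         (765, 0.01),
    'vision input: instructions':  (49, 0.01),
    'vision output: description':  (241, 0.01),
    'gpt-4 input: image analysis': (241, 0.03),
    'gpt-4 input: location':       (15, 0.03),
    'gpt-4 input: camera model':   (7, 0.03),
    'gpt-4 input: reposting':      (71, 0.03),
    'gpt-4 input: hashtag list':   (2258, 0.03),
    'gpt-4 input: instructions':   (127, 0.03),
    'gpt-4 output: suggestions':   (192, 0.06),
}
total = sum(tokens / 1000 * price for tokens, price in items.values())
print(round(total, 5))   # 0.10364
print(int(20 / total))   # 192 images per $20 subscription
```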

Final thoughts

Handling images of varying quality and complexity

The vision model's performance in analysing images and generating hashtags largely depends on the quality and complexity of the images it receives. While GPT models are good at extracting details from a wide range of images, extremely low-quality images or those with highly complex scenes may pose challenges. The model's effectiveness depends on its ability to recognise key elements within an image, which can be influenced by the image's clarity and the distinctiveness of its subjects.

Potential limitations of relying on AI for hashtag generation

Relying on AI for hashtag generation introduces a few potential limitations. Firstly, the AI's grasp of context and cultural nuances might not be as fine-grained as a human's, leading to less effective or occasionally inappropriate hashtags. Secondly, AI-generated hashtags may lack the creativity or personal touch that some content creators prefer. Lastly, over-reliance on AI for this task could result in a homogenisation of hashtags across different posts, potentially diminishing the uniqueness of individual content.

Future improvements

Future improvements to ChatGPT's hashtag generation capabilities could include enhanced understanding of current social media trends and slang, better recognition of subtle image details and themes, and the ability to customise hashtag suggestions based on user preferences or past successful posts. Additionally, incorporating feedback loops where the model learns from the engagement metrics of posts it generated hashtags for could refine its accuracy and effectiveness over time, making it an even more valuable tool for content creators.

