Cross-posting from Bluesky to Jekyll
I previously set up cross-posting to Bluesky for this site, following POSSE (Publish on your Own Site, Syndicate Elsewhere): I publish on my own site and syndicate out from there. I have now set up the inverse, cross-posting from Bluesky into this site. Archiving content from other networks into my home site is called PESOS: Publish Elsewhere, Syndicate (to your) Own Site.
As part of publishing where the people are, I want to reply to Bluesky posts in Bluesky. But I still want them here for archival/linking/discovery/etc. So, I’ve set up a new section on this site for replies and threads.
As a nice bonus, I’ve also configured the script to pull in any new posts I write on Bluesky, archived here as regular blog posts. And I’ve set it up so they get syndicated back out to Mastodon if they’re “plain”: not a reply and not mentioning a Bluesky user directly.
Workflow
I tacked this on to the GitHub workflow that I run regularly to syndicate my posts out to Mastodon and Bluesky. The steps are pretty simple:
- Execute the `utilities/pesos_bluesky` script.
- If there are any code/content changes, commit them and push them up to the repository.
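The second step only needs to act when the import actually produced something. As a rough sketch of that check (not the actual workflow file, which isn't shown here), assuming the job shells out to `git`; the helper names `pending_changes?` and `commit_and_push` and the commit message are hypothetical:

```ruby
require 'open3'

# True when `git status --porcelain` reported at least one changed file.
def pending_changes?(porcelain_output)
  !porcelain_output.strip.empty?
end

# Hypothetical helper: commits and pushes only if the working tree is dirty.
# Returns false when there was nothing to commit or a git command failed.
def commit_and_push(message)
  status, = Open3.capture2('git', 'status', '--porcelain')
  return false unless pending_changes?(status)

  system('git', 'add', '--all') &&
    system('git', 'commit', '-m', message) &&
    system('git', 'push')
end
```

A run that imports nothing leaves the working tree clean, so `commit_and_push('Import Bluesky posts')` would return `false` without creating an empty commit.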
The script, written in Ruby, searches the Bluesky API for posts from my account. (This is something of a backdoor way to get my own posts, but the Bluesky API kept returning server errors when I attempted to use the documented `getPosts` feed endpoint, so I’m resorting to the search interface, which gives me what I want because I can filter by posting account.)
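To make the search request concrete, here is a small sketch of building the query URL with `URI.encode_www_form`, so the handle and wildcard query are escaped properly. The helper name `search_posts_uri` is hypothetical, and the optional `limit` and `cursor` parameters reflect my understanding of the `app.bsky.feed.searchPosts` endpoint rather than anything the script itself uses:

```ruby
require 'uri'

SEARCH_ENDPOINT = 'https://bsky.social/xrpc/app.bsky.feed.searchPosts'

# Builds the search URL for one account's posts.
# `cursor` is only added when paging past the first batch of results.
def search_posts_uri(handle, cursor: nil, limit: 25)
  params = { author: handle, q: '*', limit: limit }
  params[:cursor] = cursor if cursor

  uri = URI.parse(SEARCH_ENDPOINT)
  uri.query = URI.encode_www_form(params)
  uri
end
```

For example, `search_posts_uri('example.bsky.social')` yields a URL whose query string filters by `author` and matches everything with `q=*`.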
It then finds any of those posts that are replies and imports them as Jekyll posts. For every other post/status, it checks whether a corresponding Jekyll post already exists on the site (one whose `bluesky_status_url` frontmatter points at the same Bluesky post). Any that are already present get skipped. Any that aren’t get imported as blog posts.
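The selection logic can be sketched on plain hashes, without Jekyll or the network. This is an illustration under assumptions: `partition_items`, `reply?`, and `rkey` are hypothetical names, and `known_status_urls` stands in for the `bluesky_status_url` values already present in the site's frontmatter:

```ruby
# A search result is a reply when its record points at a parent post.
def reply?(item)
  !item.dig('record', 'reply', 'parent', 'uri').nil?
end

# The record key is the last path segment of an at:// URI or a bsky.app URL.
def rkey(uri)
  uri.split('/').last
end

# Splits search results into replies and not-yet-imported posts.
def partition_items(items, known_status_urls)
  known_rkeys = known_status_urls.compact.map { |url| rkey(url) }
  replies, others = items.partition { |item| reply?(item) }
  new_posts = others.reject { |item| known_rkeys.include?(rkey(item['uri'])) }
  [replies, new_posts]
end
```

Matching on the record key rather than the full URL means a post is still recognized if its profile segment is a handle in one place and a DID in another.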
The script is using the little Post class I wrote for syndicating earlier/elsewhere. I also extracted the Bluesky auth helper methods into a module so it could be shared between the PESOS and POSSE scripts.
Code
```ruby
# utilities/pesos_bluesky
require 'jekyll'
require 'json'
require 'net/http'
require 'time'
require_relative 'bluesky/auth'
require_relative 'models/post'
require_relative 'models/asset'

module PESOS
  # Pulls in posts and replies from Bluesky
  class Bluesky
    include ::Bluesky::Auth

    class Error < StandardError; end

    attr_reader :site

    def initialize
      @site = Jekyll::Site.new(Jekyll.configuration({}))
      @handle = ENV.fetch('BLUESKY_HANDLE')
      @did = generate_did(@handle)
      @api_key = generate_api_key(@did, ENV['BLUESKY_PASSWORD'])
      @site.read
    end

    # Imports every post that has a reply parent as a "replies" post.
    def import_replies
      replies = outbox.filter do |item|
        item.dig('record', 'reply', 'parent', 'uri')
      end
      replies.each { |reply| import_reply(reply) }
    end

    # Imports non-reply posts that are not already on the site.
    def import_posts
      posts = outbox.filter do |item|
        next false if item.dig('record', 'reply', 'parent', 'uri')
        next false if post_exists?(resolve_post_uri(item['uri']))

        true
      end
      posts.each { |post| import_post(post) }
    end

    # Matches on the record key (last URL segment) stored in frontmatter.
    def post_exists?(url)
      slug = url.split('/').last
      site.posts.docs.any? do |post|
        next false unless post.data['bluesky_status_url']

        post.data['bluesky_status_url'].include?(slug)
      end
    end

    def import_reply(reply)
      record = reply['record']
      reply_uri = record.dig('reply', 'parent', 'uri')
      title = "Reply to #{find_did_handle(reply_uri)}"
      canonical = resolve_post_uri(reply['uri'])
      # The rendered embed ("#view" types) lives on the post view, not the
      # record, so read it from the same place import_post does.
      image = find_image(reply['embed'])

      post = Post.new(
        body: embed_facets(record['text'], record['facets']),
        category: 'replies',
        date: Time.parse(record['createdAt']),
        in_reply_to: resolve_post_uri(reply_uri),
        bluesky_status_url: canonical,
        canonical: canonical,
        image: image,
        slug: reply['uri'].split('/').last,
        tags: ['bluesky'],
        title: title
      )
      post.create_file
    end

    def import_post(post)
      record = post['record']
      title = 'Post to Bluesky'
      post_uri = resolve_post_uri(post['uri'])
      canonical = nil
      mastodon_social_status_url = false
      image = find_image(post['embed'])

      # lock it down to bluesky if directed at another user
      if mention?(record['facets'])
        canonical = post_uri
        mastodon_social_status_url = nil
      end

      post = Post.new(
        body: embed_facets(record['text'], record['facets']),
        category: 'blog',
        date: Time.parse(record['createdAt']),
        hide_title: true,
        mastodon_social_status_url: mastodon_social_status_url,
        bluesky_status_url: post_uri,
        canonical: canonical,
        image: image,
        slug: post['uri'].split('/').last,
        tags: ['bluesky'],
        title: title
      )
      post.create_file
    end

    # Downloads the first attached image, or falls back to a link thumbnail.
    def find_image(embed)
      return nil if embed.nil?

      if embed['$type'] == 'app.bsky.embed.images#view'
        image_url = embed['images'].first['fullsize']
        return nil if image_url.nil?

        asset = Asset.new(url: image_url, category: 'images')
        asset.download
        asset.public_path
      elsif embed['$type'] == 'app.bsky.embed.external#view'
        embed['external']['thumb']
      end
    end

    def mention?(facets)
      facets&.any? do |facet|
        facet['features']&.any? do |feature|
          feature['$type'] == 'app.bsky.richtext.facet#mention'
        end
      end
    end

    # Wraps link and mention facets in Markdown link syntax.
    # NOTE: byteStart/byteEnd are UTF-8 byte offsets, while String#insert
    # counts characters and each insert shifts later offsets, so this is
    # exact only for ASCII text with a single facet.
    def embed_facets(text, facets)
      return text if facets.nil? || facets.empty?

      facets.each do |facet|
        next unless facet['features']

        facet['features'].each do |feature|
          if feature['$type'] == 'app.bsky.richtext.facet#link'
            bytestart = facet['index']['byteStart']
            byteend = facet['index']['byteEnd']
            text.insert(bytestart, '[')
            text.insert(byteend + 1, "](#{feature['uri']})")
          elsif feature['$type'] == 'app.bsky.richtext.facet#mention'
            bytestart = facet['index']['byteStart']
            byteend = facet['index']['byteEnd']
            text.insert(bytestart, '[')
            text.insert(byteend + 1, "](#{resolve_did_to_profile(feature['did'])})")
          end
        end
      end
      text
    end

    def find_did_handle(uri)
      return 'Bluesky' if uri.nil?

      did = extract_did(uri)
      return 'Bluesky' if did.nil?

      resolve_handle(did) || 'Bluesky'
    end

    # Fetches this account's posts through the search endpoint.
    def outbox
      # Fixed: the handle was written as @{@handle}, which Ruby does not interpolate.
      uri = URI.parse("https://bsky.social/xrpc/app.bsky.feed.searchPosts?author=#{@handle}&q=*")
      request = Net::HTTP::Get.new(uri)
      request.content_type = 'application/json'
      request['Authorization'] = "Bearer #{@api_key}"
      req_options = {
        use_ssl: uri.scheme == 'https'
      }
      res = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
        http.request(request)
      end
      unless res.is_a?(Net::HTTPSuccess)
        puts res.body
        raise Error, 'Status post request failed'
      end
      resp_body = JSON.parse(res.body)
      resp_body['posts']
    end
  end
end

bluesky = PESOS::Bluesky.new
bluesky.import_replies
bluesky.import_posts
```
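One detail of `embed_facets` is worth illustrating separately: Bluesky facet indexes are UTF-8 byte offsets, whereas `String#insert` counts characters, and every insertion moves the text after it. Below is a possible byte-safe variant, offered as a sketch rather than the script's actual behavior. The names `link_facets` and `facet_target` are hypothetical. It splices into a binary copy of the text and applies facets from last to first so earlier offsets stay valid:

```ruby
# Returns the URL a facet should link to, or nil for other facet types.
def facet_target(facet)
  (facet['features'] || []).each do |feature|
    case feature['$type']
    when 'app.bsky.richtext.facet#link'
      return feature['uri']
    when 'app.bsky.richtext.facet#mention'
      return "https://bsky.app/profile/#{feature['did']}"
    end
  end
  nil
end

# Wraps each link/mention facet in Markdown link syntax using byte offsets.
# Works on a copy, so the original text is left untouched.
def link_facets(text, facets)
  return text if facets.nil? || facets.empty?

  bytes = text.b # binary copy: one "character" per byte
  facets.sort_by { |facet| -facet['index']['byteStart'] }.each do |facet|
    target = facet_target(facet)
    next if target.nil?

    # Insert the closing part first so byteStart is still accurate.
    bytes.insert(facet['index']['byteEnd'], "](#{target})".b)
    bytes.insert(facet['index']['byteStart'], '['.b)
  end
  bytes.force_encoding(Encoding::UTF_8)
end
```

With this ordering, a post containing accented characters or emoji before a link, or several links in a row, should still end up with the brackets around the right words.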
```ruby
# utilities/bluesky/auth.rb
require 'json'
require 'net/http'

module Bluesky
  # Authentication against the Bluesky API
  module Auth
    # Constants resolve lexically, so the including class's Error is not
    # visible from these methods; define one here for the raises below.
    class Error < StandardError; end

    def generate_did(handle)
      uri = URI.parse('https://bsky.social/xrpc/com.atproto.identity.resolveHandle')
      params = { handle: handle }
      uri.query = URI.encode_www_form(params)
      response = Net::HTTP.get_response(uri)
      unless response.is_a?(Net::HTTPSuccess)
        puts response.body
        raise Error, 'DID identification failed'
      end
      JSON.parse(response.body)['did']
    end

    # Looks up the DID document and returns the handle as "@handle",
    # or nil when the directory does not know the DID.
    def resolve_handle(did)
      uri = URI.parse("https://plc.directory/#{did}")
      response = Net::HTTP.get_response(uri)
      unless response.is_a?(Net::HTTPSuccess)
        # Fixed: was `response.code&.tos`, which is not a method.
        return nil if response.code.to_s == '404'

        puts response.body
        raise Error, 'Handle resolution failed'
      end
      at_handle = JSON.parse(response.body)['alsoKnownAs'].first
      "@#{at_handle.split('at://').last}"
    end

    def resolve_did_to_profile(did)
      "https://bsky.app/profile/#{did}"
    end

    def extract_did(uri)
      match = uri.match(%r{did:plc:[^/]*})
      return nil if match.nil?

      match[0]
    end

    # at://<DID>/<COLLECTION>/<RKEY> resolves to https://bsky.app/profile/<DID>/post/<RKEY>
    # example URI:
    # at://did:plc:pko7wbcggok753hnvndxh3ni/app.bsky.feed.post/3ld75432fq42c
    def resolve_post_uri(uri)
      did = extract_did(uri)
      rkey = uri.split('/').last
      return nil if did.nil? || rkey.nil?

      # Prefer the human-readable handle when it can be resolved.
      pretty_did = resolve_handle(did)
      did = pretty_did.split('@').last unless pretty_did.nil?
      "https://bsky.app/profile/#{did}/post/#{rkey}"
    end

    # Creates a session and returns its access token.
    def generate_api_key(did, password)
      uri = URI.parse('https://bsky.social/xrpc/com.atproto.server.createSession')
      request = Net::HTTP::Post.new(uri)
      request.content_type = 'application/json'
      request.body = JSON.dump({
                                 'identifier' => did,
                                 'password' => password
                               })
      req_options = {
        use_ssl: uri.scheme == 'https'
      }
      response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
        http.request(request)
      end
      unless response.is_a?(Net::HTTPSuccess)
        puts response.body
        raise Error, 'API Key generation failed'
      end
      JSON.parse(response.body)['accessJwt']
    end
  end
end
```
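The `at://` to `bsky.app` mapping described in the comment on `resolve_post_uri` can also be shown as a pure function with no network lookup. This is a sketch only: the `AT_POST_URI` pattern and the `post_url` helper are hypothetical names, and the optional `handle:` argument stands in for whatever `resolve_handle` would return:

```ruby
# Matches at://<authority>/<collection>/<rkey> and names each part.
AT_POST_URI = %r{\Aat://(?<authority>[^/]+)/(?<collection>[^/]+)/(?<rkey>[^/]+)\z}

# Converts a post's at:// URI into its bsky.app web URL.
# Falls back to the DID in the profile segment when no handle is supplied.
def post_url(at_uri, handle: nil)
  match = AT_POST_URI.match(at_uri.to_s)
  return nil if match.nil? || match[:collection] != 'app.bsky.feed.post'

  "https://bsky.app/profile/#{handle || match[:authority]}/post/#{match[:rkey]}"
end
```

Using the example URI from the comment, `post_url('at://did:plc:pko7wbcggok753hnvndxh3ni/app.bsky.feed.post/3ld75432fq42c')` gives `https://bsky.app/profile/did:plc:pko7wbcggok753hnvndxh3ni/post/3ld75432fq42c`, and anything that is not a post URI returns `nil`.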
Future
Things I’d like to add in the future:
- Detect any tags used in the Bluesky post and set them correctly on the Jekyll post
- Detect any embedded videos and import them
- Use an LLM to generate a title for the post upon import
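For the first item, one possible starting point is the facet list the script already reads. As I understand the Bluesky rich text format, hashtags appear as `app.bsky.richtext.facet#tag` features carrying a `tag` value without the leading `#`. Treat that shape as an assumption, and `facet_tags` as a hypothetical helper name:

```ruby
# Collects the unique hashtag values found in a post's facets.
def facet_tags(facets)
  (facets || []).flat_map do |facet|
    (facet['features'] || []).filter_map do |feature|
      feature['tag'] if feature['$type'] == 'app.bsky.richtext.facet#tag'
    end
  end.uniq
end
```

The result could then be appended to the existing tag list, for example `tags: ['bluesky'] + facet_tags(record['facets'])`.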