Table of Contents: 1. Review site manually, get a lay of the site and typical page template structures. 2. Crawl and extract text, metadata, and metrics with ScreamingFrog. ScreamingFrog's API …
Continue Reading about My content analysis workflow – part 1 →
Audience-first content strategies for experts
By Jim Thornton · Last Updated:
We all know what PageRank is in the context of Google / SEO, but it's also an incredibly useful tool for internal purposes around website analysis and surfacing insights. CheiRank is like PageRank …
Continue Reading about Get Website PageRank and CheiRank natively in Neo4j →
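As a rough illustration of the idea in that excerpt: CheiRank is commonly defined as PageRank computed on the same link graph with every link reversed, so it rewards pages that link out heavily rather than pages that are linked to. The sketch below is plain Python power iteration, not the Neo4j approach the post covers; the `links` list, the function names, and the damping value of 0.85 are assumptions made for the example.

```python
from collections import defaultdict


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) link pairs."""
    nodes = sorted({page for pair in links for page in pair})
    out_links = defaultdict(list)
    for source, target in links:
        out_links[source].append(target)

    count = len(nodes)
    rank = {page: 1.0 / count for page in nodes}
    for _ in range(iterations):
        # Pages with no outgoing links spread their rank evenly over all pages.
        dangling = sum(rank[page] for page in nodes if not out_links[page])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {page: base for page in nodes}
        for source in nodes:
            targets = out_links[source]
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank


def cheirank(links, **kwargs):
    """CheiRank: PageRank on the same graph with every link reversed."""
    return pagerank([(target, source) for source, target in links], **kwargs)


# Hypothetical internal links: (linking page, linked page).
links = [
    ("/", "/blog"),
    ("/", "/about"),
    ("/blog", "/post-a"),
    ("/blog", "/post-b"),
    ("/post-a", "/"),
    ("/post-b", "/"),
]
pr = pagerank(links)
cr = cheirank(links)
for page in sorted(pr, key=pr.get, reverse=True):
    print(f"{page:10s} PageRank={pr[page]:.3f} CheiRank={cr[page]:.3f}")
```

In this toy graph the home page collects the most PageRank because both posts link back to it, while `/about` has the lowest CheiRank because it links to nothing.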
By Jim Thornton · Last Updated:
How to generate a list of your published WordPress posts by word count in SQL (phpMyAdmin): So to get started, you will just need to drag-n-drop the How-to Schema block in the Gutenberg editor. The …
Continue Reading about Get a list of published posts by word count in WordPress database →
By Jim Thornton · Last Updated:
You have a few options here. Table of Contents: Option 1: Use a word count WordPress plugin and copy/paste the posts list into a spreadsheet. Option 2: Crawl a list of your posts with Screaming Frog …
Continue Reading about Get a list of posts with word counts into a spreadsheet →
By Jim Thornton · Last Updated:
This guide is for logging into a platform and scraping the data you want from it. The use case we are following is logging into my email software (ConvertKit) and scraping the engagement data from …
Continue Reading about Crawl and scrape sites that require your login →
By Jim Thornton · Last Updated:
This is a nice peek into how powerful graphs are for site audits and how these capabilities are coming along for those interested in a content audit and ongoing recommendations service. I'm chipping …
Continue Reading about Website audit with ScreamingFrog crawl data and Neo4j →
By Jim Thornton · Last Updated:
I spent about 20 hours trying to figure out how to properly load some crawl data into a graph database over the past two weeks. Once I hit bottom and gave up, I got the answer within 5 minutes of …
Continue Reading about Prioritizing internal redirects to fix →
By Jim Thornton · Last Updated:
Automated crawl report tools generally give you a dump of data. The better ones try to prioritize critical issues over items worth further investigation, but generally you end up with a dashboard of …
By Jim Thornton · Last Updated:
Running a multi-line Cypher script via the CLI is a good middle-ground approach between starting out and a full deployment solution. Quick Start for cypher-shell (TL;DR): Fire up Neo4j Desktop (for Mac …
By Jim Thornton · Last Updated:
There are a lot of reasons to let Google power your site search results. For one, the whole thing is complicated. Even WordPress's built-in method for ordering results in a keyword phrase query by …
Continue Reading about Powering your site search with Google search results →
By Jim Thornton · Last Updated:
A lot of the data preprocessing is trial and error. It's creating some regular expression, or relying on some method from a library I don't fully understand and then making sure it does more good than …
Continue Reading about Generate a sample CSV file with x rows from larger CSV →
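One way to get a small file for that kind of trial and error, sketched under assumptions: the function below copies the header row plus the first `rows` data rows of a large CSV into a new file. The function name, the file names in the usage note, and the choice to take the first rows rather than a random sample are all assumptions for illustration, not necessarily what the post does.

```python
import csv
import itertools


def write_sample_csv(source_path, sample_path, rows):
    """Copy the header plus the first `rows` data rows into a smaller CSV.

    Returns the number of data rows written (fewer than `rows` if the
    source file is shorter).
    """
    with open(source_path, newline="", encoding="utf-8") as source, \
            open(sample_path, "w", newline="", encoding="utf-8") as sample:
        reader = csv.reader(source)
        writer = csv.writer(sample)

        header = next(reader, None)
        if header is None:
            return 0  # empty source file, nothing to copy
        writer.writerow(header)

        written = 0
        # islice stops after `rows` rows, so the large file is never fully read.
        for row in itertools.islice(reader, rows):
            writer.writerow(row)
            written += 1
        return written
```

A call such as `write_sample_csv("crawl.csv", "crawl_sample.csv", 100)` (hypothetical file names) would then produce a 100-row sample that keeps the original column headers, which is usually enough to test a regular expression or cleaning step before running it on the full file.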
By Jim Thornton · Last Updated:
Doing some research, it looks like the best practice is developing a preprocessing workflow around 1) whatever your goals for wrangling the data are and 2) what the raw data actually looks …
Continue Reading about Preprocessing data with Python for NLP Prep →