
PRIV P35a Overview

by
George Morgan

Page no: P35a

Use Case

Very often URLs include a post name that is far too long.

Example:

“adamsmith.org/blog/tax-spending/as-ever-the-problem-with-richard-murphy-is-that-he-has-no-knowledge-of-the-subject-under-discussion/”

As discussed by Yoast and Google’s Matt Cutts, a post name should contain at most 5 words, and these words should be meaningful.

Our URL Remove Stop Words plugin radically reduces the number of words in the URL. It removes typical stop words and, in the Pro version, all other insignificant words.

 

Workflow and Functionalities

Publication of a new post or page

When the user presses “Publish”, the plugin modifies the URL.

Functionality not needed for Google Go-Live: The user must be able to edit the URL manually without the plugin overriding it. Hence the plugin does not run when the user presses “OK” in the URL field or “Update” for the whole post or page.

Low priority now, but needed for plugin go-live

When I manually modify the URL, the Stop Words plugin should not override it. Currently it does.

The plugin should only act when I first enter a title without clicking into the URL field.

WordPress itself does not update the URL when I change the title, even if I add more nouns.
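The rule above can be written down as a small decision function. This is only a sketch in Python, not plugin code: `SaveEvent`, its two fields and `should_rewrite_slug` are hypothetical names, and how the plugin would actually detect a manual URL edit is left open.

```python
from dataclasses import dataclass


@dataclass
class SaveEvent:
    """Hypothetical description of one save of a post or page."""
    is_first_publish: bool     # True only when the post goes live for the first time
    slug_edited_by_user: bool  # True if the user typed into the URL field


def should_rewrite_slug(event: SaveEvent) -> bool:
    """Return True only on first publication with an untouched URL."""
    if event.slug_edited_by_user:
        return False  # never override a manual URL
    return event.is_first_publish  # later "Update" clicks leave the URL alone
```

With this rule, pressing “Update” on an already published post never changes the URL, which matches what WordPress itself does.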

 

Feed WordPress Integration

This plugin is integrated with all known feed plugins, such as FeedWordPress.
When new posts are published via FeedWordPress, the plugin starts automatically.

 

Button: “Remove Stop Words” for all posts/pages (only pro version)

A button removes the stop words from all posts and pages. The process takes a while. The duration depends on the number of posts, but most of the time it is between 2 and 5 minutes.

 Remove Stop Words

Button: “Restore old URL based on titles”  (only pro version)

The user may decide to restore the old URLs for all posts or all pages.
This overwrites the URLs with ones based on the words in the title, so all stop words will be back in the URL.
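As a rough illustration of the restore step, the sketch below rebuilds a post name from each title. It is written in Python and only approximates what WordPress does when it derives a post name from a title (the core function for that is `sanitize_title`). The function names and the `{post_id: title}` input are assumptions, and non-ASCII characters are simply dropped here.

```python
import re


def post_name_from_title(title):
    """Lower-case the title and join its alphanumeric words with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


def restore_post_names(titles_by_id):
    """Map every post ID to the post name rebuilt from its title."""
    return {post_id: post_name_from_title(title)
            for post_id, title in titles_by_id.items()}
```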

Restore old URL based on Title

Choice for date range of posts and pages (only pro version)

Users will need to specify the date range or category to which the stop word changes apply, similar to what Media Tools does.

This is also a workaround when the FeedWordPress plugin has issues.
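A minimal sketch of the date range selection, in Python. The post records (dictionaries with `id` and `published` keys) and the function name are hypothetical. The range is treated as inclusive on both ends, which is an assumption.

```python
from datetime import date


def select_posts_in_range(posts, start, end):
    """Return the IDs of posts published between start and end, inclusive."""
    return [p["id"] for p in posts if start <= p["published"] <= end]


posts = [
    {"id": 1, "published": date(2015, 1, 10)},
    {"id": 2, "published": date(2015, 3, 5)},
    {"id": 3, "published": date(2015, 6, 20)},
]
selected = select_posts_in_range(posts, date(2015, 1, 1), date(2015, 3, 31))
```

Only the selected IDs would then be passed to the stop word removal, which also keeps a re-run short when only recently imported feed posts need fixing.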

 Date Range Stop Words Pro

 

 

Avoid duplicate URL (pro and light version)

We must check that the generated URL does not exist yet. WordPress does this by default, but our plugin does not. It might call the core WP procedure for this check.

Decision: WordPress does not use the full URL, but only the post name, to identify a post. For performance reasons we prefer to append the post ID rather than a running number to the post name. The test whether the post name exists is done only once. If the post name already exists, we add the post ID.

Advantage of this procedure: we do not need to count the number of posts with the same post name.
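The decision above can be sketched as follows, in Python. `unique_post_name` and the `existing_names` set are hypothetical; in the plugin the existence test would be one database lookup (WordPress core has `wp_unique_post_slug` for its own numbering scheme). The hyphen between post name and ID is an assumption.

```python
def unique_post_name(post_name, post_id, existing_names):
    """Return post_name, or post_name plus the post ID if it is already taken."""
    if post_name not in existing_names:  # the single existence test
        return post_name
    # The post ID is unique, so no counting of same-named posts is needed.
    return f"{post_name}-{post_id}"
```

For example, if “richard-murphy” is already in use, post 42 gets “richard-murphy-42”, regardless of how many other posts share the name.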

 

Algorithm to remove the stop words

For Google Go-Live we start directly with step 3.

For steps 1 and 2, see the algorithm that removes the final S for plural, 3rd person and genitive.

 

This algorithm can be built with a finite state machine. More on PHP and state machines:

State Machine
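To make the state machine idea concrete, here is a small two-state sketch in Python. It is one possible design, not the plugin's implementation: the state names, the function name and the choice of a hyphen as the only separator are assumptions. The machine reads the post name character by character and emits a word only if it is not a stop word.

```python
from enum import Enum, auto


class State(Enum):
    BETWEEN_WORDS = auto()  # last character was a separator (or nothing read yet)
    IN_WORD = auto()        # currently collecting the characters of a word


def remove_stop_words_fsm(post_name, stop_words):
    """Scan post_name once and keep only words that are not in stop_words."""
    state = State.BETWEEN_WORDS
    current = []  # characters of the word being read
    kept = []     # words that survive

    def flush():
        word = "".join(current)
        current.clear()
        if word not in stop_words:
            kept.append(word)

    for ch in post_name.lower():
        if ch == "-":
            if state is State.IN_WORD:
                flush()  # transition IN_WORD -> BETWEEN_WORDS ends a word
            state = State.BETWEEN_WORDS
        else:
            current.append(ch)
            state = State.IN_WORD
    if state is State.IN_WORD:
        flush()  # end of input also ends a word
    return "-".join(kept)
```

Called with “the-problem-with-richard-murphy” and the stop words “the” and “with”, the function returns “problem-richard-murphy”.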

 

 

Step 3: Remove all stop words (light and pro version)

Use these stop words for the light version.

DONE

3a) Execute existing stop word lists (light version)

Add to our existing Stop Word list:

DONE

3b) New Stop Words Lists (pro version)

This will be a manual list

 

 

3c) New Stop Words List: “Null meaning nouns” (pro version)

Words: consequence, consequences, question, means, mean, year, day, month, information, possibility, image, photo, album, index

The full list of insignificant nouns is here. Task for RAD: add the ones that are NOT marked “NO”.
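The split between light and pro lists in step 3 could look like the Python sketch below. `LIGHT_STOP_WORDS` is a tiny illustrative sample, not the real list; `NULL_MEANING_NOUNS` holds the words from 3c. Returning the original post name when every word would be removed is an assumption the spec does not cover.

```python
# Illustrative sample only; the real light list is much longer.
LIGHT_STOP_WORDS = {"a", "an", "as", "the", "is", "of", "that", "with"}

# Words from step 3c (pro version).
NULL_MEANING_NOUNS = {
    "consequence", "consequences", "question", "means", "mean", "year",
    "day", "month", "information", "possibility", "image", "photo",
    "album", "index",
}


def shorten_post_name(post_name, pro=False):
    """Remove light stop words, plus the null meaning nouns in the pro version."""
    stop_words = (LIGHT_STOP_WORDS | NULL_MEANING_NOUNS) if pro else LIGHT_STOP_WORDS
    kept = [w for w in post_name.split("-") if w and w not in stop_words]
    # Assumption: never return an empty post name.
    return "-".join(kept) or post_name
```

For “the-question-of-tax-spending”, the light version yields “question-tax-spending” and the pro version yields “tax-spending”.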

 

Plugin Overview

We are building a plugin that forms the post URL from relevant nouns only, with no prepositions or verbs. This helps when someone searches in Google: we expect better indexing because the URL shows only important and relevant words.

Stop Words PRO

User Interface

The plugin UI is located under Tools -> Stop Words PRO.

The lists

The default lists are divided into different types of words:

  • Usual Stop Words
  • Regular Verbs
  • Past forms of regular verbs
  • Irregular Verbs
  • Prepositions
  • Adjectives
  • Adverbs

The lists are predefined, so we don’t need to add them manually. We can change them or add new words to them. The words must be added in a specific format: a comma and a space after every word. The last word MUST NOT be followed by a comma or space.
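The list format can be checked before saving. The following Python sketch is an assumption about how strict the check should be: `parse_word_list` is a hypothetical name, and it rejects entries that contain spaces, so multi-word entries are not supported here.

```python
def parse_word_list(raw):
    """Parse 'word, word, word' and reject any other layout."""
    if raw == "":
        return []
    if raw.endswith((",", " ")):
        raise ValueError("the last word must not be followed by a comma or space")
    words = raw.split(", ")
    for word in words:
        # A leftover comma or space means a separator was not exactly ', '.
        if not word or "," in word or " " in word:
            raise ValueError(f"badly separated entry: {word!r}")
    return words
```

For example, `parse_word_list("about, above, after")` returns the three words, while `"about,above"` and `"about, above, "` raise a `ValueError`.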

 

When finished, the user presses “update”.

 Stop Words List


Installation

The installation process is simple. All we need to do is upload the plugin via the browser or FTP. After that we activate it under Plugins -> Installed Plugins, and that’s it. The plugin starts working automatically and integrates with the feed plugins.

 

Important: The default lists above are included in the installation.


Stop Words Pro: Installation

P03a New Tabs Stop Words

 

 

 

 

Costs of Pro Version

The paid version will cost $3-5, and we will also give it away as a gift in promotions.

The word group lists are part of the Pro version.

 

Additional functionality (later)

Will not do: synonym remover! (This would be the ultimate version.)

 
