Wednesday, May 24, 2017

Narrowing WordPress Search

One thing you should know about me is that while I'm pretty savvy when it comes to building things for the web today, I'm still pretty new to it. I spent my childhood building theme parks in Roller Coaster Tycoon, not Flash games for Newgrounds.

Rather, I gravitated toward web development during the final year of college and because of that, I am a bit of a superstitious developer.

That is, I tend to fall into the trap of thinking any code I don't understand backwards and forwards must be written with black magic. In particular, the PHP function sprintf held sway over my soul for quite some time before I bothered to learn what it did.

It's my belief that there are a lot of developers like me out there, with the mindset of: "This thing works pretty well without me learning about it, so I'll just leave it alone."

What I've found is that as soon as I take a good look at an intimidating piece of code, it starts to make sense. But instead of learning from that, I go and get petrified over some other piece of code the very next week.

If you get nothing else from this article, remember this: don't be a superstitious developer! Fight the fear! It's not as scary as you think. To demonstrate, I'd like to talk more about an important but often overlooked part of WordPress: the search.

Demystifying the WordPress Search

The biggest problem with the WordPress search isn't in the code. The problem is that nobody is willing to give it a chance.

Searching for "How does WordPress search work" in Google is a frustrating experience. Seven out of the first ten results are "How to replace WordPress search." Developers have decided it's not worth understanding even the basics of WordPress search.

This is how a WordPress search works: you give the search a keyword, and it tries to match it in a post title, post excerpt, or post content. The easiest way to unpack this is by looking at the request property of $wp_query:

SELECT SQL_CALC_FOUND_ROWS wp_posts.ID FROM wp_posts WHERE 1=1 AND (((wp_posts.post_title LIKE '%wat%') OR (wp_posts.post_excerpt LIKE '%wat%') OR (wp_posts.post_content LIKE '%wat%')))
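
To see that SQL on your own site, one option is to print the request property while a search page renders. This is only a minimal sketch: the helper name describe_search_request is made up for illustration, the output is limited to administrators as a precaution, and the wp_footer hook is registered only when WordPress is loaded.

```php
<?php
/**
 * Return the SQL that a query object ran, wrapped in an HTML comment.
 * Hypothetical helper: the name is an assumption, not a WordPress API.
 */
function describe_search_request( $query ) {
    // WP_Query stores the SQL it generated in its public "request" property.
    $sql = isset( $query->request ) ? (string) $query->request : '';

    // Neutralize "--" so the SQL cannot close the HTML comment early.
    return '<!-- search SQL: ' . str_replace( '--', '- -', $sql ) . ' -->';
}

// Only hook in when WordPress is loaded, so the file can also run on its own.
if ( function_exists( 'add_action' ) ) {
    add_action( 'wp_footer', function () {
        global $wp_query;
        // Print the SQL on search result pages, and only for administrators.
        if ( is_search() && current_user_can( 'manage_options' ) ) {
            echo describe_search_request( $wp_query );
        }
    } );
}
```

With this in place, viewing the page source of a search results page should show the generated SQL just before the closing body tag.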

So search is just another WordPress query, just like the ones that we use all the time to get posts! Like the proverb about wrenches vis-à-vis ball-dodging, if you can configure a post query, you can configure a search query.

Why Would You Want to Improve the Search Query?

The search query has been giving developers angst, poor sleep and skin lesions for years now (well, angst definitely, the others are just a fair guess). There are nearly 8,000 questions on Stack Overflow alone regarding "WordPress search."

The classic issue you'll run into is what I would call "extending" the search query: making it search more than just the post title, excerpt or content. That can be difficult because searching over things like custom meta keys can balloon out of control quickly.

The Danger of Over-Scaling the Search Query

In SQL terms, that means that first you'd have to combine the wp_posts table with the wp_postmeta table (called a "join") and then search over meta values as well, like this:

SELECT SQL_CALC_FOUND_ROWS wp_posts.ID FROM wp_posts INNER JOIN wp_postmeta ON wp_posts.ID = wp_postmeta.post_id WHERE 1=1 AND (((wp_posts.post_title LIKE '%wat%') OR (wp_posts.post_excerpt LIKE '%wat%') OR (wp_posts.post_content LIKE '%wat%') OR (wp_postmeta.meta_value LIKE '%wat%')))

This query doesn't really scale well because there are a ton of junky meta items, like "_edit_lock," "_edit_last" or "_thumbnail_id." In other words, there are fields that don't have words to them and are useless to search over.

Scaling the Search Query Appropriately

Now, if every post had about 10 custom meta values that you didn't need to search over, that could get out of hand pretty quick. That's why something like this is more handy:

SELECT SQL_CALC_FOUND_ROWS wp_posts.ID FROM wp_posts INNER JOIN wp_postmeta ON wp_posts.ID = wp_postmeta.post_id WHERE 1=1 AND (((wp_posts.post_title LIKE '%wat%') OR (wp_posts.post_excerpt LIKE '%wat%') OR (wp_posts.post_content LIKE '%wat%') OR (wp_postmeta.meta_key = 'genre' AND wp_postmeta.meta_value LIKE '%wat%')))

The problem with that is that you'd then have to pick and choose meta keys to search over, basically creating a whitelist of keys. I don't know about you, but that sounds like a lot to manage in the long run.
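
To make the whitelist idea concrete, here is a rough sketch of how the meta part of that WHERE clause could be generated from a list of allowed keys. The function name and the "director" key are made up for illustration, and the table name is hard-coded to match the SQL above. It returns a format string with %s placeholders plus the values to bind, which is the shape $wpdb->prepare() accepts. On a real site you would also run the keyword through $wpdb->esc_like() before adding the wildcards.

```php
<?php
/**
 * Build a SQL fragment that searches meta values only for whitelisted keys.
 * Hypothetical helper: returns a format string with %s placeholders plus the
 * values to bind, so nothing user-supplied is concatenated into the SQL.
 */
function build_meta_search_clause( array $allowed_keys, $like_term ) {
    $parts = array();
    $args  = array();

    foreach ( $allowed_keys as $key ) {
        // One "key matches AND value contains the term" pair per allowed key.
        $parts[] = '(wp_postmeta.meta_key = %s AND wp_postmeta.meta_value LIKE %s)';
        $args[]  = $key;
        $args[]  = $like_term;
    }

    // With an empty whitelist there is nothing to search in the meta table.
    $sql = $parts ? '(' . implode( ' OR ', $parts ) . ')' : '';

    return array( $sql, $args );
}

// "director" is an invented second key, just to show the OR between pairs.
list( $sql, $args ) = build_meta_search_clause( array( 'genre', 'director' ), '%wat%' );
echo $sql, "\n";
```

Keeping the whitelist in one array at least means there is a single place to update when a new searchable field comes along.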

The good news is, there are plenty of plugins out there that help you do just that. If you have a lot of text-heavy custom meta on your site, I'd definitely recommend getting a search extending plugin to help your users find what they're looking for.

The Case for Narrowing the Search

What you don't hear about as often is narrowing the WordPress search, sometimes called "faceted search." What you have to remember about the WordPress search query is that it's not smart. The search query is just a simple text comparison; it's not a Google algorithm.

Therefore, there's a danger that you could end up hitting users with so many matches that it's impossible for them to figure out which page has the information they're looking for.

More importantly, your search page is your last line of defense for website navigability. Chances are that if users are going to the search box, it's because they can't find what they're looking for anywhere else and they're gambling that the search will lead them to the right page faster.

In that case, there are a couple of situations where I think it could make sense to narrow the search:

  • A specific post type.
  • A specific post type plus a taxonomy term.
  • A specific page and its children.

Narrowing the Search Query for a Post Type

    Setting the WordPress search form to limit its search to a specific post type is a matter of changing the action attribute. Normally, a WordPress search form would look like this:

    <form role="search" method="get" class="search-form" action="https://www.joshsmoviewebsite.com/">

    To modify it to search over a custom post type that has an archive named "movies," you would add the archive slug on to the end of the action URL, like this:

    <form role="search" method="get" class="search-form" action="https://www.joshsmoviewebsite.com/movies/">

    You can narrow the search down further with custom taxonomies. For instance, I want to let people filter search results by movie genre as well. To do that, I'd include a select box for the taxonomy named genre inside of the search form using wp_dropdown_categories. The resulting search form markup would look like this:

    <form role="search" method="get" class="searchform" action="https://www.joshsmoviewebsite.com/movies">
        <label class="screen-reader-text" for="s">Search for:</label>
        <input type="text" value="" name="s" id="s" size="1" placeholder="Search Movies">
        <select name="genre" id="genre" class="postform">
            <option value="indiana-jones">Indiana Jones</option>
            <option value="not-indiana-jones">Other</option>
        </select>
        <button type="submit">Find Movies</button>
    </form>
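
The select box in that markup could be produced with a call along these lines, placed inside the form in your search form template. Treat it as a sketch under a few assumptions: the helper name movie_genre_dropdown_args is invented, the taxonomy is registered as genre, and value_field is set to slug so the option values are slugs like indiana-jones rather than term IDs.

```php
<?php
/**
 * Arguments for a genre <select> inside the search form.
 * Hypothetical helper name; the keys are wp_dropdown_categories() arguments.
 */
function movie_genre_dropdown_args() {
    return array(
        'taxonomy'    => 'genre',    // The custom taxonomy to list.
        'name'        => 'genre',    // Becomes the name attribute, so the URL gets ?genre=...
        'id'          => 'genre',
        'class'       => 'postform',
        'value_field' => 'slug',     // Use term slugs as option values instead of term IDs.
        'hide_empty'  => true,       // Skip genres that have no movies yet.
    );
}

// Only call into WordPress when it is loaded; this echoes the <select> markup.
if ( function_exists( 'wp_dropdown_categories' ) ) {
    wp_dropdown_categories( movie_genre_dropdown_args() );
}
```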

    This way, WordPress will add the genre value to the search query string, like this:

    https://www.joshsmoviewebsite.com/movies?s=crusade&genre=indiana-jones

    And take a look at how that changes the SQL request:

    SELECT SQL_CALC_FOUND_ROWS wp_posts.ID FROM wp_posts LEFT JOIN wp_term_relationships ON (wp_posts.ID = wp_term_relationships.object_id) WHERE 1=1 AND ( wp_term_relationships.term_taxonomy_id IN (81) ) AND (((wp_posts.post_title LIKE '%crusade%') OR (wp_posts.post_excerpt LIKE '%crusade%') OR (wp_posts.post_content LIKE '%crusade%'))) AND wp_posts.post_type = 'movie' AND (wp_posts.post_status = 'publish') GROUP BY wp_posts.ID ORDER BY wp_posts.post_title LIKE '%crusade%' DESC, wp_posts.post_date DESC LIMIT 0, 10

    Not only does the post type have to be a movie, but the post also has to be assigned the term whose term_taxonomy_id is 81, which is our Indiana Jones genre.
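
If it helps to think in terms of the post queries you already write, that URL corresponds roughly to the WP_Query arguments below. This is an illustrative sketch rather than what WordPress literally builds internally, and the helper name movie_search_args is made up. WordPress takes care of turning the slug into the numeric ID that shows up in the SQL.

```php
<?php
/**
 * Query arguments that roughly mirror /movies?s=crusade&genre=indiana-jones.
 * Hypothetical helper, for illustration only.
 */
function movie_search_args( $keyword, $genre_slug ) {
    return array(
        's'         => $keyword,  // The search keyword.
        'post_type' => 'movie',   // Only the movie post type.
        'tax_query' => array(
            array(
                'taxonomy' => 'genre',
                'field'    => 'slug',      // Match terms by slug rather than ID.
                'terms'    => $genre_slug,
            ),
        ),
    );
}

$args = movie_search_args( 'crusade', 'indiana-jones' );

// Only run the query when WordPress is loaded.
if ( class_exists( 'WP_Query' ) ) {
    $movies = new WP_Query( $args );
}
```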

    Narrowing the Search Query Based on Page Ancestor

    I ran into a use case for this while the company I work at was building a website for our city, Winter Haven. As part of the project, we combined six websites into one. The problem with that, though, is that we wound up with a lot of content: more than 250 pages the last time I checked.

    To make search manageable for users, I worked on a way to limit the search based on the page ancestor. For example, if you search the Fire Department section, you'll only get results from sub-pages of the top-level Fire Department page.

    By default, the WordPress search scans through all posts in the database for matches, but you can limit this by adding an action to pre_get_posts.

    First, I added a hidden input with the highest-ancestor page ID in the search form:

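    <?php
    if ( is_page() ) {
        $current_post = get_post();
        $ancestors    = get_post_ancestors( $current_post );
        // The last ancestor is the top-level page. A top-level page has no ancestors, so fall back to its own ID.
        $highest_ancestor_id = $ancestors ? array_pop( $ancestors ) : $current_post->ID;
        echo '<input type="hidden" name="section" value="' . esc_attr( $highest_ancestor_id ) . '">';
    }
    ?>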

    Then, I adjusted the query in pre_get_posts like this:

    add_action( 'pre_get_posts', function ( $query ) {
        // Leave admin screens and secondary queries alone.
        if ( is_admin() || ! $query->is_main_query() ) {
            return;
        }

        // Only act on searches that were submitted with a section.
        if ( ! $query->is_search() || ! isset( $_REQUEST['section'] ) ) {
            return;
        }

        // Cast the submitted page ID to a non-negative integer.
        $section_id = absint( $_REQUEST['section'] );

        // Get all the pages to filter against.
        $my_wp_query  = new WP_Query();
        $all_wp_pages = $my_wp_query->query( array( 'post_type' => 'page', 'posts_per_page' => -1 ) );

        // Get the IDs for all child pages of this section.
        $children = get_page_children( $section_id, $all_wp_pages );

        if ( ! empty( $children ) ) {
            $child_ids = array();
            foreach ( $children as $child ) {
                $child_ids[] = $child->ID; // A post object stores its ID in ->ID, not ->post_id.
            }
            $query->set( 'post__in', $child_ids );
        }
    } );

    This way, users could choose to search over specific sections of the website, like Library or Parks & Recreation, rather than searching the entire site all the time.
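
If get_page_children feels like one of those black-magic functions, here is a plain PHP sketch of the idea behind it: start at the section's top-level page and keep collecting pages whose parent is already in the set. It is not WordPress's implementation. The function name is invented, and the pages are simple arrays with made-up IDs rather than real post objects.

```php
<?php
/**
 * Collect the IDs of every descendant of $section_id from a flat page list.
 * Hypothetical stand-in for illustration; each page is an array with
 * 'ID' and 'post_parent' keys rather than a real post object.
 */
function collect_descendant_ids( $section_id, array $pages ) {
    // Index pages by parent so each level is a single lookup.
    $children_of = array();
    foreach ( $pages as $page ) {
        $children_of[ $page['post_parent'] ][] = $page['ID'];
    }

    $found = array();
    $queue = array( $section_id );

    // Walk the tree breadth-first, starting at the section's top-level page.
    while ( $queue ) {
        $parent_id = array_shift( $queue );
        $kids      = isset( $children_of[ $parent_id ] ) ? $children_of[ $parent_id ] : array();
        foreach ( $kids as $child_id ) {
            $found[] = $child_id;
            $queue[] = $child_id;
        }
    }

    return $found;
}

// Made-up page tree: 10 stands in for "Fire Department", 20 for "Library".
$pages = array(
    array( 'ID' => 10, 'post_parent' => 0 ),
    array( 'ID' => 11, 'post_parent' => 10 ),
    array( 'ID' => 12, 'post_parent' => 11 ),
    array( 'ID' => 20, 'post_parent' => 0 ),
    array( 'ID' => 21, 'post_parent' => 20 ),
);

print_r( collect_descendant_ids( 10, $pages ) ); // Contains 11 and 12, but not 20 or 21.
```

Note that the top-level page itself is not in the result, which matches the behavior described above: only sub-pages of the section come back.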

    What I Learned From Working with the WordPress Search

    Like I said earlier, the first thing I learned is to not be intimidated by code just because I don't know how it works. Deciphering code and figuring out how it all fits together makes us better developers.

    The second thing I learned is that the WordPress search isn't bad, it's just simple. And like most things about WP, the search is very extensible. You can either broaden the search to include more fields than the default, or narrow results to a specific post type, taxonomy or even a page-specific section of your site.

    But most of all, I learned to stop leaving the WordPress search as an afterthought. Take care of your website visitors by putting some thought into how you can optimize search for them.


    Source: Narrowing WordPress Search
