paperlined.org
document updated on May 5, 2017
Notice: Some major changes were announced on 5/4, along with the news that CloudSearch functionality will be eliminated soon.



intro

Under the hood, Reddit supports a second search syntax, different from the one people normally use, that provides a few extra features. So far there hasn't been much documentation about it. Hopefully this page will make it easier for others to use.

The key to accessing the alternate search syntax is to add &syntax=cloudsearch to the end of your search URL. You will need to modify the search query in the URL bar only, because searching from within the webpage will end up removing the &syntax=cloudsearch parameter.
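
If you'd rather build the URL in a script than edit the URL bar by hand, here is a minimal sketch. The base address https://www.reddit.com/search and the q parameter name are assumptions (this page only specifies the syntax=cloudsearch parameter), and build_search_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed base address of Reddit's search page; not stated on this page.
SEARCH_URL = "https://www.reddit.com/search"


def build_search_url(query: str) -> str:
    """Return a search URL that asks for the CloudSearch syntax."""
    # urlencode percent-escapes quotes, colons and parentheses in the query
    # and turns spaces into "+". The "q" parameter name is an assumption.
    params = {"q": query, "syntax": "cloudsearch"}
    return f"{SEARCH_URL}?{urlencode(params)}"


print(build_search_url("title:'egg'"))
```

Because the dictionary keeps its insertion order, syntax=cloudsearch ends up at the end of the URL, as described above.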


cloudsearch reference

The CloudSearch manual [2] is a useful reference.

word prefix                            'jailb*'
phrase                                 '"why you no"'
number range                           1000..2000
open-ended number range                40000..
boolean operators (prefix notation)    (and 'jailb*' (not 'jailbait'))
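
Nested prefix-notation queries are easy to mistype by hand, so here is a small sketch of helpers that assemble the forms listed above. The function names (term, phrase, number_range, boolean) are made up for illustration, and no escaping of quotes inside a term is attempted, since this page doesn't describe how that works.

```python
from typing import Optional


def term(text: str) -> str:
    """Wrap a word or prefix (e.g. jailb*) in single quotes."""
    return f"'{text}'"


def phrase(text: str) -> str:
    """Phrase search: double quotes nested inside the single quotes."""
    return term(f'"{text}"')


def number_range(low: int, high: Optional[int] = None) -> str:
    """Closed range low..high, or open-ended low.. when high is omitted."""
    return f"{low}.." if high is None else f"{low}..{high}"


def boolean(op: str, *operands: str) -> str:
    """Prefix-notation expression such as (and A B) or (not A)."""
    if op not in {"and", "or", "not"}:
        raise ValueError(f"unsupported operator: {op}")
    return f"({op} {' '.join(operands)})"


query = boolean("and", term("jailb*"), boolean("not", term("jailbait")))
print(query)  # (and 'jailb*' (not 'jailbait'))
```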

If you get an error message, that means there's a syntax error in your query. Although Reddit still displays search results, it has fallen back to the normal search syntax (Lucene syntax) and isn't really doing what you want.

search fields

field             description                                         example
text:             searches multiple fields: title, author,            text:'shitty_watercolour'
                  subreddit, and selftext (this is the field
                  used when you do a basic search)
title:            thread title                                        title:'egg'
selftext:         body of a self-text thread                          selftext:'cheese'
author:           author's name                                       author:'NY1227'
site:             website                                             site:'youtube.com'
url:                                                                  url:'twitter.com/wojyahoonba'
flair:            multiple fields: flair_text + flair_css_class
flair_text:       flair attached to the story                         flair_text:'dry'
flair_css_class:  CSS class associated with the flair                 (and subreddit:'gametales' flair_css_class:'table')
subreddit:        subreddit it was posted in                          (or subreddit:'pics' subreddit:'aww')

timestamp:        Unix time the thread was created                    timestamp:1356998400..1357084799
num_comments:     number of comments (note: may give results          num_comments:40000..
                  with fewer comments when comments have been
                  deleted)
nsfw: / over18:   is the thread marked NSFW? (0=SFW, 1=NSFW)          (and title:'fruit' nsfw:1)
self: / is_self:  is the thread a self-thread? (0=link, 1=self post)  (and subreddit:'aww' is_self:1)
top:              total score
downs:            number of downvotes                                 downs:4000..
ups:              number of upvotes                                   ups:13000..
sr_id:            base36 decoded version of the subreddit ID          sr_id:4606680
                                                                      (4606680 = "t5_2qqjc" = /r/todayilearned)
fullname:         the fullname of the thread (should begin            fullname:'t3_19990k'
                  with "t3_")
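
The sr_id conversion in the table is plain base36 decoding of the part after "t5_". A quick sketch, where sr_id_from_fullname is a hypothetical helper name:

```python
def sr_id_from_fullname(fullname: str) -> int:
    """Convert a subreddit fullname such as t5_2qqjc to its numeric sr_id."""
    prefix, _, base36_id = fullname.partition("_")
    if prefix != "t5" or not base36_id:
        raise ValueError(f"not a subreddit fullname: {fullname!r}")
    # int() with base 36 reads the digits 0-9 and the letters a-z.
    return int(base36_id, 36)


print(f"sr_id:{sr_id_from_fullname('t5_2qqjc')}")  # sr_id:4606680
```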

These fields are also how Reddit itself implements things like multireddit searches and searches over "this week": a subreddit: or timestamp: field is added onto the user's query.
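
To illustrate the idea, here is a sketch that adds a subreddit: clause and a one-day timestamp: range onto an existing query. It assumes UTC day boundaries and bounds chosen to match the timestamp example in the table; the helper names day_clause and restrict are made up, and the exact way Reddit composes these clauses internally is not documented here.

```python
from datetime import date, datetime, timedelta, timezone


def day_clause(day: date) -> str:
    """timestamp: range from the first to the last second of one UTC day."""
    start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    end = start + timedelta(days=1) - timedelta(seconds=1)
    return f"timestamp:{int(start.timestamp())}..{int(end.timestamp())}"


def restrict(user_query: str, *clauses: str) -> str:
    """AND extra field clauses onto an existing CloudSearch expression."""
    if not clauses:
        return user_query
    return f"(and {user_query} {' '.join(clauses)})"


print(restrict("title:'egg'", "subreddit:'aww'", day_clause(date(2013, 1, 1))))
# (and title:'egg' subreddit:'aww' timestamp:1356998400..1357084799)
```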

subreddit fields

You can also use CloudSearch syntax when searching subreddits.

field             description                                         example
text:             searches multiple fields: name, title, and
                  description (this is the field used when you
                  do a basic search)
language:         find subreddits for a specific language             de, es
                  (although it seems to be woefully incomplete)
name:             No idea. How is this different from
                  description: ? [1]
description:      No idea. [1]
sidebar:          Very rarely but sometimes matches. Maybe this
                  was disabled at some point, but it still has
                  some stale data in the index? [1]
header_title:     This one seems to never match. [1]
activity:         ?
link_type:        ?
subscribers:      ?
title:            ?
type:             ?
nsfw: / over18:   ?

relationship between CloudSearch and normal search

When someone does a normal search, their query gets converted to an Amazon CloudSearch query, and the CloudSearch engine does the work of finding the results. The query syntax conversion is partly done to provide continuity with Reddit's older search engine (Apache Lucene), and partly because the Lucene search syntax is much easier for regular users to use.

The public-facing syntax currently in use is actually parsed by the Whoosh library (which uses a search syntax that's "very similar to Lucene"), and then the L2CS library takes the output from the Whoosh parser and builds a CloudSearch query.

Note that there's a serious bug in the syntax-converter library.

Reddit Analytics

There's a related site called redditanalytics.com. It's currently in pre-beta and says it "should be finished by October 1, 2014", but some functionality has been available since early 2014.

One benefit is that it allows you to search comments.