Using Integer Document IDs in RavenDB Indexes
At work, we recently moved our database from MongoDB to RavenDB. In this context, we set up a couple of indexes for all frequent queries to optimize reading documents. We also adjusted quite a few data access methods in our application to query the indexes instead of directly loading documents by their ID.
#The Issue: Indexing Troubles
When we wrote the aforementioned indexes, we ran into a problem with integer document IDs. All of our entity POCOs use an ID
property of type int
. The document IDs are standard RavenDB document names when using integer IDs: They're composed of the POCO class name and the value of the ID
property, thus making them human-readable (e.g. comments/1337).
Before we start, let me give you a quick overview over one of the indexes we're using in our application.
#Our Scenario: Indexing Comments by Topic
In our application, we have a pretty simple Comment
class, which looks as follows. Note that it actually has a couple more properties, which I omitted here for the sake of brevity.
public class Comment
{
public int ID { get; set; }
public int TopicID { get; set; }
public string Author { get; set; }
public string Text { get; set; }
}
The TopicID
holds information about the topic that was commented. Since our application requires comments to be queried by topic, we created an index which, well, indexes the TopicID
property:
public class Comments_ByTopic
: AbstractIndexCreationTask<Comment, Comments_ByTopic.QueryResult>
{
public class QueryResult
{
public int ID { get; set; }
public int TopicID { get; set; }
public string Author { get; set; }
public string Text { get; set; }
// More properties (omitted)
}
public Comments_ByTopic()
{
Map = comments =>
from comment in comments
select new QueryResult
{
ID = comment.ID,
Author = comment.Author,
TopicID = comment.TopicID,
Text = comment.Text.Value,
// More stuff happening here (loading documents, ...)
};
Index(x => x.TopicID, FieldIndexing.NotAnalyzed);
StoreAllFields(FieldStorage.Yes);
}
}
Actually, our index does a little more than shown here. We don't store the author as a string, for example, but instead an ID referencing the corresponding user document. The index then makes use of RavenDB's LoadDocument<T>
feature to pull in the author document for each comment. However, I left out this part since this post isn't about LoadDocument<T>
.
With the index defined as above, there was one problem, though: The ID
property was never part of the indexed fields and, consequently, was always 0 when queried. So, what do you do? Let's have a look at the workaround that solved the issue for us.
#Our Solution: Two ID properties
We tried different things to make the index work correctly. In the end, we created a separate DocumentID
property of type string
and told RavenDB to treat it as the document ID:
var documentStore = new DocumentStore
{
ConnectionStringName = "RavenDB",
Conventions =
{
FindIdentityProperty = prop => prop.Name == "DocumentID"
}
};
This DocumentID
property contains the full (!) ID of each document, e.g. comments/1337. Because we're only interested in the 1337 part — which is the actual integer ID we deal with in our application — we split the string when indexing the documents:
Map = comments =>
from comment in comments
select new QueryResult
{
ID = int.Parse(comment.DocumentID.ToString().Split('/')[1]),
Author = comment.Author,
TopicID = comment.TopicID,
Text = comment.Text.Value,
// More stuff happening here (loading documents, ...)
};
The Map
expression as listed above made the index work for us. We also wrote a little wrapper around the integer ID
property because we didn't want to change our codebase to use strings as document IDs:
[JsonIgnore]
private string _documentID { get; set; }
public string DocumentID
{
get { return _documentID; }
set
{
_documentID = value;
ID = int.Parse(value.Split('/')[1]);
}
}
[JsonIgnore]
public int ID { get; private set; }
While the solution may seem a little hacky, it works smoothly. Please note that if you're using this DocumentID
property, the corresponding documents' names all have to follow the <collectionName>/<ID>
pattern.