My query ended up rather clunky/not-very-graceful because of these limitations, but it works. There are links lost in the chain – these people have to be added in as a parent depending on gender.Ĥ No left joins are allowed in CTEs either. This meant querying each side separately and joining the results.ģ Some people are just listed as ancestors, not parents. The graph match query mandates an equi join on either side so pulling both parents in a single query when one may not exist is not possible with this clause.Ģ The mother’s side and father’s side each can have its own branch since many characters partnered many times with multiple people. While querying this structure for constructing a family tree, I ran into following limitations:ġ A lot of characters have just one parent defined. The novel has many partnerships that are not technically husband/wife, but I kept the dataset simpler and made it this way. We can choose to store them, to make our querying easier, or not. Other relationships can be derived from here. I kept the relationships down to ‘father of’, ‘mother of’,’husband of’,’wife of’ to begin with. I insert into these tables the characters from Hawaii and how they are related. (ĪS NODE ON TEXTIMAGE_ON ĪS EDGE ON TEXTIMAGE_ON And it has a remark that tells us how these two people are related. It has a from_id and a to_id from the PersonNode table. Next, I have a second table, which I call PersonRelatedTo. In this I store all details regarding the individual – name, sex, ethnicity(blended if person is product of an interracial partnership). I just have one table, which I call a PersonNode. But, the way we store data and our queries can be a lot simpler in graph model. Now, SQL server not being a full fledged graph implementation – would still mandate the use of recursion to get our results. And then you’d have to write a bunch of join based queries and use recursive CTEs to get the results we want. You would need multiple tables for relationships such as children, partners, parents and so on. ![]() If you were to design it in the relational world, you would need some kind of hierarchy to indicate the level the person is. So I need a people table, which is my ‘node’ table, and I need a relationship table, which is my ‘edge’ table. If we look at these questions, my queries are based on people, and their relationships. Graph data is designed strongly based on what kind of querying you want to do. To keep things simpler, I came up with what I would like to query on from such a database, if it existed:ġ What is the family tree starting from Person A?Ģ What is the mix of ethnicity this person has?ģ How are person A and person B connected, or if they are connected?Ĥ Who is the oldest traceable ancestor of person A ?ĥ How many first/second/third cousins does this person have? There is polygamy, polyandry and all kinds of weird relationships from that time. The other ethnic groups intermarry, a lot. Out of these, the Japanese remain their own group, largely. It has primarily people from 4 ethnic groups: Caucasian North Americans, Japanese, Chinese and Native Hawaiian. The novel spans a huge historic period, from 724 AD into around 1937. I decided to use the characters in Hawaii as my test data to understand how to query geneological data, stored in graph database format. It is based on history and evolution of the Hawaiian islands, and has a rather complex network of characters, with many ethnicities and several interwoven relationships. I am a big fan of Michener’s novels, and the most recent one I have read is ‘Hawaii’. ![]() My friend Buck Woody made a tweet-remark that it would need a graph database to keep track of the characters and relationships in some of Michener’s novels. Sometime ago – we were discussing novels by famed novelist James Michener. I was looking into some data to play with in this regard. But conceptually, this can apply to many similar tree/hierarchy type structures. A lot of us may not have work related applications that are genealogy related, necessarily. Genealogy, or querying family trees is an important graph data application. I have been working a lot of SQL Graph related queries and applications of the graph data concept to the extent possible within SQL Server’s graph capabilities.
0 Comments
Leave a Reply. |