Understanding Data Joins in D3.js - Part 2
In my previous post, Understanding Data Joins in D3.js - Part 1, we saw the basics of data joins. Here we will go into more detail, but first a quick recap.
Quick recap:
- Use data() to assign data to elements. It returns the collection of elements for which corresponding data was present, called the update selection.
- By default, data is mapped to elements by index: the nth datum is assigned to the nth element.
- enter() returns the collection of (placeholder) elements for which data was present but no element was found in the DOM. It can be used with append() to create the new elements.
- exit() returns the collection of elements that exist in the DOM but have no corresponding data. It can be used with remove() to remove those elements.
Keeping the above in mind, consider that we have the following data:
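Here is the data as it can be reconstructed from the join tables later in this post. The variable name dataArray is the one used throughout the text; the exact formatting is an assumption:

```javascript
// One object per div we want to draw.
var dataArray = [
  { value: "one", color: "red", width: 100 },
  { value: "two", color: "green", width: 200 },
  { value: "three", color: "blue", width: 150 },
  { value: "four", color: "yellow", width: 300 }
];
```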
Each object in the array has the properties value, color and width, and we will use these to plot divs. Say that initially there are no divs in the body.
We want a div for each object in the array. That should be easy, right? We know we can use enter() for data that has no corresponding element and append the missing elements. Let's do that:
Working example -
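A minimal sketch of this step, assuming D3 is already loaded on the page as the global d3. The names updateDivSelection and enterDivSelection are the ones this post refers to later; the static text is our own placeholder so that the new divs are visible:

```javascript
// Assumes D3 is already loaded on the page as the global `d3`.
var dataArray = [
  { value: "one", color: "red", width: 100 },
  { value: "two", color: "green", width: 200 },
  { value: "three", color: "blue", width: 150 },
  { value: "four", color: "yellow", width: 300 }
];

function render() {
  // Bind the data to all divs in the body (there are none yet).
  var updateDivSelection = d3.select("body").selectAll("div").data(dataArray);

  // One placeholder per datum that has no div.
  var enterDivSelection = updateDivSelection.enter();

  // Turn every placeholder into a real div.
  // The static text is a placeholder of ours, only there to make the divs visible.
  enterDivSelection.append("div").text("a div");
}

// Run only where D3 is available (for example, in the browser).
if (typeof d3 !== "undefined") {
  render();
}
```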
This works as expected, but we haven't used the data from the objects for the text and styling of the divs. Let's do that.
Notice that we use the text() and style() functions for the text and styling of the divs, and pass each of them a function that extracts a value from the object.
The parameter 'd' of that function is the complete object bound to the element, and we use that object to extract whatever we need from it.
Let's see this in action:
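A sketch of what this can look like, again assuming the global d3. Treating width as pixels and applying color as the background colour are our assumptions; the text only says that these properties are used for styling:

```javascript
// Assumes D3 is already loaded on the page as the global `d3`.
var dataArray = [
  { value: "one", color: "red", width: 100 },
  { value: "two", color: "green", width: 200 },
  { value: "three", color: "blue", width: 150 },
  { value: "four", color: "yellow", width: 300 }
];

function render() {
  var updateDivSelection = d3.select("body").selectAll("div").data(dataArray);
  var enterDivSelection = updateDivSelection.enter();

  enterDivSelection.append("div")
    // `d` is the whole data object bound to the div being created.
    .text(function (d) { return d.value; })
    // Assumption: width is a pixel value.
    .style("width", function (d) { return d.width + "px"; })
    .style("background-color", function (d) { return d.color; });
}

// Run only where D3 is available (for example, in the browser).
if (typeof d3 !== "undefined") {
  render();
}
```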
Note: we could have chained the data() and enter() calls into a single statement
and it would still have worked, but as a practice always keep references to the values returned by data() and enter(), as we did above with 'updateDivSelection' and 'enterDivSelection'.
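For comparison, the chained form mentioned in the note could look roughly like this (assuming the global d3; renderChained is a hypothetical name). It creates the same divs, but the update selection returned by data() is never stored, so there is nothing to call exit() on or to restyle later:

```javascript
// Assumes D3 is already loaded on the page as the global `d3`.
var dataArray = [
  { value: "one", color: "red", width: 100 },
  { value: "two", color: "green", width: 200 },
  { value: "three", color: "blue", width: 150 },
  { value: "four", color: "yellow", width: 300 }
];

function renderChained() {
  d3.select("body").selectAll("div")
    .data(dataArray)   // update selection, not kept anywhere
    .enter()           // enter selection, not kept anywhere
    .append("div")
    .text(function (d) { return d.value; });
}

// Run only where D3 is available (for example, in the browser).
if (typeof d3 !== "undefined") {
  renderChained();
}
```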
Let's say we are using some real-time data: our dataArray is updated after 1 second and we want these changes to be reflected in the DOM. By update we mean that any of these can happen:
- Case 1: a new object can be added to the array
- Case 2: an object can be removed from the array
- Case 3: for existing objects in the array, only the width can be updated
How do we reflect these changes in the DOM? There are two approaches:
- The simple one is to remove all the divs and re-render using the updated data.
- Run the update cycle (i.e. the change of width) only on existing elements, the append cycle for newly added data, and the remove cycle for divs whose data was removed.
The first approach is simple and straightforward, but it is costly and not performance-friendly if you have a large amount of data and a complex UI, so it should not be used.
The second approach is a bit trickier to understand and is not guaranteed to outperform the first approach in every case. It helps most when you are updating just a few properties of existing data objects, as we are doing with 'width' here.
But if you are changing every value, or most of the values, of the objects in the data, it is much like rendering the elements again and therefore very similar to the first approach. (The only difference is that you reuse the same divs rather than creating new ones as in the first approach.)
Likewise, if you are removing almost all the previous objects and adding new objects to the array, it again becomes almost the same as the first approach, so you will not see a tremendous performance change in these cases.
But how do we implement the second approach? Remember: data() gives us the collection of updated elements, enter() gives us the collection of new (placeholder) elements, and exit() gives us the collection of removed elements. Using these, we can achieve it.
Now you will ask: aren't we using them already? Yes, we are, but we need to change the way we write our code a bit to get the optimisation.
We will update dataArray after 1 second and then call the render function again. Wait a second, calling render again? Doesn't this sound like the first approach? No, because we are not removing the previous divs before calling render again. This is where the beauty lies: D3 runs the update cycle on the existing elements only, rather than removing and recreating all the divs.
Let's see how.
Case 1 -
Let's just add a new object to dataArray and call render again:
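A sketch of Case 1, assuming the global d3. The fifth object and the helper name addItem are hypothetical, since the original values are not given in the text:

```javascript
// Assumes D3 is already loaded on the page as the global `d3`.
var dataArray = [
  { value: "one", color: "red", width: 100 },
  { value: "two", color: "green", width: 200 },
  { value: "three", color: "blue", width: 150 },
  { value: "four", color: "yellow", width: 300 }
];

function render() {
  var updateDivSelection = d3.select("body").selectAll("div").data(dataArray);
  var enterDivSelection = updateDivSelection.enter();

  enterDivSelection.append("div")
    .text(function (d) { return d.value; })
    .style("width", function (d) { return d.width + "px"; })
    .style("background-color", function (d) { return d.color; });
}

function addItem() {
  // Hypothetical fifth object, made up for this example.
  dataArray.push({ value: "five", color: "orange", width: 250 });
  // On this second call only the new datum ends up in the enter selection,
  // so only one div is appended; the four existing divs are left alone.
  render();
}

// Run only where D3 is available (for example, in the browser).
if (typeof d3 !== "undefined") {
  render();                  // initial render: four divs
  setTimeout(addItem, 1000); // one second later: the fifth div
}
```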
This was easy: we kept all of our previous code as it is, and a 5th div was added. Re-run the example if you missed the update.
Case 2 -
Let's just remove the last object from dataArray and call render again. But will this alone be sufficient? No, because we haven't used the exit() function to remove the extra div. We do it like this:
Working example -
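A sketch of Case 2, assuming the global d3. The helper name removeLastItem is hypothetical:

```javascript
// Assumes D3 is already loaded on the page as the global `d3`.
var dataArray = [
  { value: "one", color: "red", width: 100 },
  { value: "two", color: "green", width: 200 },
  { value: "three", color: "blue", width: 150 },
  { value: "four", color: "yellow", width: 300 }
];

function render() {
  var updateDivSelection = d3.select("body").selectAll("div").data(dataArray);
  var enterDivSelection = updateDivSelection.enter();

  enterDivSelection.append("div")
    .text(function (d) { return d.value; })
    .style("width", function (d) { return d.width + "px"; })
    .style("background-color", function (d) { return d.color; });

  // Divs that no longer have a datum are in the exit selection: remove them.
  updateDivSelection.exit().remove();
}

function removeLastItem() {
  dataArray.pop(); // drop the last object
  render();        // the last div now has no datum and gets removed
}

// Run only where D3 is available (for example, in the browser).
if (typeof d3 !== "undefined") {
  render();
  setTimeout(removeLastItem, 1000);
}
```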
This was also easy and simple. Now let's move on to the last case, updating the width.
Case 3 -
Let's just update the width of the first object in dataArray and call render again. Will this do what we want? No. Why? Because we are styling the divs on the enter() selection, not on the data() selection,
so even if data() picks up changes for a div, they will not be reflected. What to do then? Should we move the style() calls from enter to data? Then what happens to newly added divs? They will have no width or background-color initially (you can try it out). So should we have style() on both enter and data? Let's do just that and see if it works:
Working example -
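A sketch of Case 3 with the styling written twice, assuming the global d3. The new width of 250 and the helper name updateFirstWidth are hypothetical. The update selection is styled before the enter selection is appended so that the duplication is visible regardless of the D3 version:

```javascript
// Assumes D3 is already loaded on the page as the global `d3`.
var dataArray = [
  { value: "one", color: "red", width: 100 },
  { value: "two", color: "green", width: 200 },
  { value: "three", color: "blue", width: 150 },
  { value: "four", color: "yellow", width: 300 }
];

function render() {
  var updateDivSelection = d3.select("body").selectAll("div").data(dataArray);

  // Existing divs: re-apply the styles so that changed data shows up.
  updateDivSelection
    .style("width", function (d) { return d.width + "px"; })
    .style("background-color", function (d) { return d.color; });

  // New divs: the very same styling again, plus the text.
  updateDivSelection.enter().append("div")
    .text(function (d) { return d.value; })
    .style("width", function (d) { return d.width + "px"; })
    .style("background-color", function (d) { return d.color; });

  updateDivSelection.exit().remove();
}

function updateFirstWidth() {
  dataArray[0].width = 250; // hypothetical new width
  render();
}

// Run only where D3 is available (for example, in the browser).
if (typeof d3 !== "undefined") {
  render();
  setTimeout(updateFirstWidth, 1000);
}
```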
This works well, but we had to write the same code twice. That's not good, so how do we address it?
D3.js gives you one more way to deal with this. If you read the documentation for enter(), it says:
The enter selection merges into the update selection when you append or insert.
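That means that after you append to the enter selection, the data selection (updateDivSelection in our code above) holds references to both the updated and the newly added divs. Note that this is how D3 v3 behaves; from v4 onward the two selections are combined explicitly with merge(). You can use this feature so that you don't have to repeat code. Let's see how we do that: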
Working example -
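A sketch of the deduplicated version, assuming the global d3. The quoted behaviour applies to D3 v3; since we cannot be sure which version you have loaded, the sketch checks for merge() (available from v4 onward) and uses it when present. The names enteredDivs and allDivs are hypothetical:

```javascript
// Assumes D3 is already loaded on the page as the global `d3`.
var dataArray = [
  { value: "one", color: "red", width: 100 },
  { value: "two", color: "green", width: 200 },
  { value: "three", color: "blue", width: 150 },
  { value: "four", color: "yellow", width: 300 }
];

function render() {
  var updateDivSelection = d3.select("body").selectAll("div").data(dataArray);

  // Things that never change are set once, on the entering divs.
  var enteredDivs = updateDivSelection.enter().append("div")
    .text(function (d) { return d.value; });

  // D3 v3: after the append above, updateDivSelection already contains the new divs.
  // D3 v4 and later: the two selections have to be combined with merge().
  var allDivs = typeof enteredDivs.merge === "function"
    ? enteredDivs.merge(updateDivSelection)
    : updateDivSelection;

  // Properties that can change are set once, for new and existing divs alike.
  allDivs
    .style("width", function (d) { return d.width + "px"; })
    .style("background-color", function (d) { return d.color; });

  updateDivSelection.exit().remove();
}

// Run only where D3 is available (for example, in the browser).
if (typeof d3 !== "undefined") {
  render();
}
```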
This gives us an idea of how we should write our D3.js code: always use the update selection after enter() for properties that can change.
Now for the last part of the data join series: key functions. Suppose I remove the first object from dataArray using the shift() function and call render again. Ideally the div with the text 'one' should disappear, right? Let's try that.
But here we see that instead of the div with the text 'one', the div with the text 'four' was removed. Why? As we discussed in the first part, the data join happens by index by default, and that is what causes this. Let's dig in a bit.
Initially there were no divs, so the join would look like this:
Index | Dom Element | Array Element |
---|---|---|
1 | (Placeholder First div) | {value : 'one', color:'red', width: 100} |
2 | (Placeholder Second div) | {value: 'two', color:'green', width: 200} |
3 | (Placeholder Third div) | {value: 'three', color:'blue', width: 150} |
4 | (Placeholder Fourth div) | {value: 'four', color:'yellow', width: 300} |
After we append the divs using enter(), the join is index-based and looks like this:
Index | Dom Element | Array Element |
---|---|---|
1 | <div> one </div> | {value : 'one', color:'red', width: 100} |
2 | <div> two </div> | {value: 'two', color:'green', width: 200} |
3 | <div> three </div> | {value: 'three', color:'blue', width: 150} |
4 | <div> four </div> | {value: 'four', color:'yellow', width: 300} |
Then, after 1 second, we remove the first object from dataArray using shift(), and it looks like this:
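The effect of shift() on the array can be checked in plain JavaScript, without D3 (the variable name removed is ours):

```javascript
var dataArray = [
  { value: "one", color: "red", width: 100 },
  { value: "two", color: "green", width: 200 },
  { value: "three", color: "blue", width: 150 },
  { value: "four", color: "yellow", width: 300 }
];

// shift() removes the first element of the array and returns it.
var removed = dataArray.shift();

console.log(removed.value); // "one"
console.log(dataArray.map(function (d) { return d.value; })); // ["two", "three", "four"]
```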
After calling render again, the data is joined again by index and looks like this:
Index | Dom Element | Array Element |
---|---|---|
1 | <div> one </div> | {value: 'two', color:'green', width: 200} |
2 | <div> two </div> | {value: 'three', color:'blue', width: 150} |
3 | <div> three </div> | {value: 'four', color:'yellow', width: 300} |
4 | <div> four </div> | |
So the update selection has size 3 and the enter selection has size 0. After the enter selection we only update the width and color, not the text, which is why the text stays the same. The exit selection has size 1, so the fourth div gets removed rather than the first.
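To make the sizes above concrete, here is a small plain-JavaScript model of an index-based join. It is only a simplified illustration, not how D3 is implemented, and joinByIndex is a hypothetical helper:

```javascript
// Simplified model of a join by index (not D3's actual implementation).
function joinByIndex(elements, data) {
  var shared = Math.min(elements.length, data.length);
  return {
    update: elements.slice(0, shared), // elements that still get a datum
    enter: data.slice(shared),         // data without an element
    exit: elements.slice(shared)       // elements without a datum
  };
}

// The four divs already in the DOM, identified by their text.
var divs = ["one", "two", "three", "four"];

// dataArray after shift(): the first object is gone.
var dataArray = [
  { value: "two", color: "green", width: 200 },
  { value: "three", color: "blue", width: 150 },
  { value: "four", color: "yellow", width: 300 }
];

var result = joinByIndex(divs, dataArray);
console.log(result.update); // ["one", "two", "three"]
console.log(result.enter);  // []
console.log(result.exit);   // ["four"]
```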
How do we fix this? All of this happens only because the join is based on index rather than some other key. To join on a key instead, we use a key function:
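A sketch of render with a key function, assuming the global d3. The key function is passed as the second argument to data(). The helper name removeFirstItem is hypothetical, and the merge() check is there only because we cannot be sure which D3 version is loaded:

```javascript
// Assumes D3 is already loaded on the page as the global `d3`.
var dataArray = [
  { value: "one", color: "red", width: 100 },
  { value: "two", color: "green", width: 200 },
  { value: "three", color: "blue", width: 150 },
  { value: "four", color: "yellow", width: 300 }
];

function render() {
  var updateDivSelection = d3.select("body").selectAll("div")
    // Key function: called for the datum already bound to each div and for
    // each new datum; a div and a datum are joined when the keys are equal.
    .data(dataArray, function (d) { return d.value; });

  var enteredDivs = updateDivSelection.enter().append("div")
    .text(function (d) { return d.value; });

  // D3 v3 merges on append; D3 v4 and later need an explicit merge().
  var allDivs = typeof enteredDivs.merge === "function"
    ? enteredDivs.merge(updateDivSelection)
    : updateDivSelection;

  allDivs
    .style("width", function (d) { return d.width + "px"; })
    .style("background-color", function (d) { return d.color; });

  updateDivSelection.exit().remove();
}

function removeFirstItem() {
  dataArray.shift(); // drop the object with value 'one'
  render();          // the div keyed 'one' now lands in the exit selection
}

// Run only where D3 is available (for example, in the browser).
if (typeof d3 !== "undefined") {
  render();
  setTimeout(removeFirstItem, 1000);
}
```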
Here we use value as the join constraint, so that the next time a join happens it is based on value: an element and a datum are joined only if their values match. This is very similar to a one-to-one table join in a relational DB based on some constraint (join table_one with table_two where table_one.key1 == table_two.key2).
After this, the join looks like this:
Index | Dom Element | Array Element |
---|---|---|
1 | <div> one </div> | |
2 | <div> two </div> | {value: 'two', color:'green', width: 200} |
3 | <div> three </div> | {value: 'three', color:'blue', width: 150} |
4 | <div> four </div> | {value: 'four', color:'yellow', width: 300} |
Testing this out:
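The key-based join in the table above can be checked with a small plain-JavaScript model. As before, this is a simplified illustration rather than D3's implementation; joinByKey and the shape of the divs array are hypothetical:

```javascript
// Simplified model of a join by key (not D3's actual implementation).
function joinByKey(elements, data, key) {
  // Index the existing elements by the key of the datum bound to them.
  var elementByKey = {};
  elements.forEach(function (element) {
    elementByKey[key(element.datum)] = element;
  });

  var update = [];
  var enter = [];
  var matchedKeys = {};
  data.forEach(function (d) {
    var k = key(d);
    if (Object.prototype.hasOwnProperty.call(elementByKey, k)) {
      update.push(elementByKey[k]); // element and datum share a key
      matchedKeys[k] = true;
    } else {
      enter.push(d);                // datum without an element
    }
  });

  // Elements whose key matched no datum.
  var exit = elements.filter(function (element) {
    return !matchedKeys[key(element.datum)];
  });

  return { update: update, enter: enter, exit: exit };
}

// The four divs in the DOM, each with the datum bound by the first render.
var divs = [
  { text: "one", datum: { value: "one", color: "red", width: 100 } },
  { text: "two", datum: { value: "two", color: "green", width: 200 } },
  { text: "three", datum: { value: "three", color: "blue", width: 150 } },
  { text: "four", datum: { value: "four", color: "yellow", width: 300 } }
];

// dataArray after shift().
var dataArray = [
  { value: "two", color: "green", width: 200 },
  { value: "three", color: "blue", width: 150 },
  { value: "four", color: "yellow", width: 300 }
];

var result = joinByKey(divs, dataArray, function (d) { return d.value; });
console.log(result.exit.map(function (element) { return element.text; })); // ["one"]
```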
So this works as expected. In practice, if your data keeps changing, use a key function to avoid problems like this.
By now you should have an idea of how to write a chart that receives real-time data and reflects it.
That is all for data joins in D3.js. I hope it helped you understand them.