Dao De Data, Chapter 1
Critically assess your biases.
Publish this analysis of biases alongside your data insights.
Consider who might want to opt out of a given data application.
Strive to empathize with users who resist features and aim to understand why.
...the Tao is found where we would least expect it--not in the strong but in the weak; not in speech but in silence; not in doing but in "not-doing."
My shower thoughts this morning were on the similarities between ethical data science and the ancient Chinese philosophy of Tao (or Dao). Since I have no Chinese heritage in my family and had never read the Tao Te Ching (or Dao De Jing) itself, these thoughts were purely speculative. But I was pretty sure there were analogies to explore nonetheless. Thankfully, the text turned out to be manageable in length, with English versions seeming to fall in the 75-150 page range. It can be read cover-to-cover in about 2 hours, but it is also broken into 81 chapters that seem to read like daily reflections. So I'd like to read through the Dao De Jing slowly and reflect on Daoist analogies for Data Science. If it takes me a month per chapter, so be it.
Many English translations are available, filtered through varying degrees of philosophical interpretation. For example, the World I-Kuan Tao Headquarters presents chapter 1 as follows:
Tao (The Way) that can be spoken of is not the Constant Tao. The name that can be named is not a Constant Name.
Nameless, is the origin of Heaven and Earth;
The named is the Mother of all things.
Thus, the constant void enables one to observe the true essence. The constant being enables one to see the outward manifestations.
These two come paired from the same origin.
But when the essence is manifested,
It has a different name.
This same origin is called “The Profound Mystery.”
As profound the mystery as It can be,
It is the Gate to the essence of all life.
The above version falls on the side of faithful non-dilution. However, it can be difficult to follow due to the translators' non-native grasp of the English language. James Legge produced this version in 1891:
1. The Tao that can be trodden is not the enduring and unchanging Tao. The name that can be named is not the enduring and unchanging name.
2. (Conceived of as) having no name, it is the Originator of heaven and earth; (conceived of as) having a name, it is the Mother of all things.
3. Always without desire we must be found,
If its deep mystery we would sound;
But if desire always within us be,
Its outer fringe is all that we shall see.
4. Under these two aspects, it is really the same; but as development takes place, it receives the different names. Together we call them the Mystery. Where the Mystery is the deepest is the gate of all that is subtle and wonderful.
Legge's translation is more comprehensible as English and, with the introduction of rhymed couplets, reads much like a poem. However, I find it to fall short of timeless. Stephen Mitchell's version, first published in 1988, obviously uses a more modern voice. What I like most about it is that his word choice attempts to convey the ideals and virtues intended rather than going for a faithfully literal translation.
The tao that can be told
is not the eternal Tao.
The name that can be named
is not the eternal Name.
The unnamable is the eternally real.
Naming is the origin
of all particular things.
Free from desire, you realize the mystery.
Caught in desire, you see only the manifestations.
Yet mystery and manifestations
arise from the same source.
This source is called darkness.
Darkness within darkness.
The gateway to all understanding.
We see that each translation offers slightly different insights, and there are many more beyond what I've included here. Since I am not a Daoist, my goal is not to determine which translation holds truest to the ancient text. For now, I'm working primarily from the Stephen Mitchell version because it has felt like the best balance between literally authentic and overly interpreted.
And my data science version of chapter 1? It relates to those decontextualized musings I used to open this post, which were the thought snippets that led me toward this Dao in the first place:
DAO DE DATA
People love to say "numbers don't lie,"
but that's false.
are not boundless facts.
Context carries the validity.
Analysis and application
are inherently creative.
Absent our biases, we recognize immeasurable complexity.
Blinded by goals, we see what we want to see.
Yet prudent robustness and naive simplicity
arise from the same source.
Each comes from a search for meaning.
Analysis of analysis.
The gateway to humane insight.
The two stanzas with which I had the most trouble were the second and the fifth. If anyone has Data Science translations for those stanzas in particular, I would love to hear them. Twitter is probably the easiest way to reach me, but there is also a contact link in the header.