Textual Data in Tableau: The Present and the Future
In simple terms, data visualization is the visual representation of data: the technological counterpart of visual communication. It relies mainly on statistical analysis, charts, graphs and information graphics to communicate quantitative insights.
With big data pushing every industry toward data-driven decision making, data visualization has become an invaluable part of that journey. It is instrumental in presenting business metrics and insights to the right people at the right time. As demand has grown, so has the number of products offering data visualization and business intelligence solutions.
The biggest name in today’s data visualization market is Tableau. With over 57,000 customer accounts, it is a clear leader in its space. It is known for its simplicity, ease of use and interactive interface, but its most important features are its ability to handle massive, dynamic datasets and its out-of-the-box integrations with some of the most widely used technologies.
One of Tableau’s biggest assets is its ability to visualize many kinds of information. Visual representation of data is in high demand because of its practical applications. Tailoring visualizations to different audiences is Tableau’s unique selling point, and because it can be deployed at enterprise scale, it is a valuable part of any organization’s tech arsenal.
Text is one of the most widely used data types in Tableau, and Tableau offers many functions for working with it. The functions are listed below, grouped by purpose.
Encoding
ASCII
Returns the ASCII code for the first character of a string
CHAR
Returns the character encoded by the ASCII code number
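As a rough illustration outside Tableau, Python's built-in ord and chr behave in a similar way for ASCII characters. The helper names below are hypothetical stand-ins written for this sketch, not Tableau functions.

```python
def ascii_code(text: str) -> int:
    """Analogue of ASCII: code of the first character of a non-empty string."""
    return ord(text[0])


def char_from_code(code: int) -> str:
    """Analogue of CHAR: character encoded by the given code number."""
    return chr(code)


print(ascii_code("Tableau"))   # 84, the code for "T"
print(char_from_code(65))      # A
```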
Text Attributes
LEN
Returns the character length of the string
MAX
Finds the value that is highest in the sort sequence defined by the database for that column.
MID
Returns the string starting at index position start. The first character in the string is position 1. If the optional argument length is added, the returned string includes only that number of characters.
MIN
Finds the value that is lowest in the sort sequence.
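The 1-based indexing of MID is easy to get wrong, so here is a minimal Python sketch of the behaviour described above. The function mid is a hypothetical helper, and the clamping of start positions below 1 is an assumption of this sketch. Python compares strings by code point, which may differ from the sort sequence a database defines for MIN and MAX.

```python
from typing import Optional


def mid(text: str, start: int, length: Optional[int] = None) -> str:
    """Analogue of MID: 1-based start position, optional non-negative length."""
    begin = max(start - 1, 0)  # assumption: a start below 1 is treated as 1
    if length is None:
        return text[begin:]
    return text[begin:begin + length]


codes = ["B-204", "A-117", "C-003"]  # hypothetical sample values

print(len(codes[0]))              # LEN analogue: 5
print(mid("Calculation", 2))      # alculation
print(mid("Calculation", 2, 5))   # alcul
print(min(codes), max(codes))     # A-117 C-003 (code point order)
```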
Text Manipulation
LEFT
Returns the left-most number of characters in the string
RIGHT
Returns the right-most number of characters in the string
LOWER
Returns a string, with all characters lowercase
LTRIM
Returns the string with any leading spaces removed
RTRIM
Returns a string with any trailing spaces removed
SPACE
Returns a string that is composed of the specified number of repeated spaces
SPLIT
Returns a substring from a string, using a delimiter character to divide the string into a sequence of tokens
TRIM
Returns the string with leading and trailing spaces removed
UPPER
Returns the string, with all characters uppercase
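To make the manipulation functions concrete, the sketch below mimics LEFT, RIGHT and SPLIT in Python and chains them with the built-in strip and upper (rough counterparts of TRIM and UPPER). All helper names are hypothetical. Returning an empty string for a missing token, and counting negative token numbers from the right, are assumptions of this sketch.

```python
def left(text: str, count: int) -> str:
    """Analogue of LEFT: the first `count` characters."""
    return text[:count] if count > 0 else ""


def right(text: str, count: int) -> str:
    """Analogue of RIGHT: the last `count` characters."""
    return text[-count:] if count > 0 else ""


def split_token(text: str, delimiter: str, token_number: int) -> str:
    """Analogue of SPLIT: return one token, numbered from 1.

    Assumes a non-empty delimiter. Negative numbers count from the right.
    """
    tokens = text.split(delimiter)
    if token_number > 0:
        index = token_number - 1
    else:
        index = len(tokens) + token_number
    if token_number == 0 or not 0 <= index < len(tokens):
        return ""  # assumption: missing tokens yield an empty string
    return tokens[index]


raw = "  north-emea-0042  "        # hypothetical untidy value
cleaned = raw.strip().upper()      # TRIM + UPPER analogue
print(cleaned)                     # NORTH-EMEA-0042
print(left("Matador", 4))          # Mata
print(right("Matador", 4))         # ador
print(split_token(cleaned, "-", 2))    # EMEA
print(split_token(cleaned, "-", -1))   # 0042
```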
Pattern Matching
CONTAINS
Returns true if the given string contains the specified substring
ENDSWITH
Returns true if the given string ends with the specified substring
FIND
Returns the index position of a substring in a string, or 0 if the substring isn't found
FINDNTH
Returns the position of the nth occurrence of substring within the specified string, where n is defined by the occurrence argument
REPLACE
Searches a string for a substring and replaces it with the given replacement. If the substring is not found, the string is returned unchanged
STARTSWITH
Returns true if string starts with substring. Leading white spaces are ignored
REGEXP_REPLACE
Returns a copy of the given string where the regular expression pattern is replaced by the replacement string
REGEXP_MATCH
Returns true if a substring of the specified string matches the regular expression pattern
REGEXP_EXTRACT
Returns the portion of the string that matches the regular expression pattern
REGEXP_EXTRACT_NTH
Returns the portion of the string that matches the regular expression pattern, taken from the nth capturing group, where n is the given index
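The position and regular-expression functions can be approximated in Python as follows. This is a sketch under stated assumptions: the helper names are hypothetical, Python's re module stands in for Tableau's regular expression engine (the two syntaxes are not identical), and occurrences found by findnth are allowed to overlap.

```python
import re
from typing import Optional


def find(text: str, substring: str, start: int = 1) -> int:
    """Analogue of FIND: 1-based position of substring, or 0 if absent."""
    return text.find(substring, start - 1) + 1


def findnth(text: str, substring: str, occurrence: int) -> int:
    """Analogue of FINDNTH: 1-based position of the nth occurrence, or 0."""
    position = 0
    for _ in range(occurrence):
        # Resume the search one character after the previous match start.
        position = find(text, substring, position + 1)
        if position == 0:
            return 0
    return position


def regexp_extract_nth(text: str, pattern: str, index: int) -> Optional[str]:
    """Analogue of REGEXP_EXTRACT_NTH: text of the nth capturing group."""
    match = re.search(pattern, text)
    if match is None or index > match.re.groups:
        return None
    return match.group(index)


print(find("Calculation", "alcu"))       # 2
print(find("Calculation", "Computer"))   # 0
print(findnth("Calculation", "a", 2))    # 7
print(bool(re.search(r"\d+", "abc 123")))                      # REGEXP_MATCH analogue: True
print(re.sub(r"\s", "-", "abc 123"))                           # REGEXP_REPLACE analogue: abc-123
print(regexp_extract_nth("abc 123", r"([a-z]+)\s+(\d+)", 2))   # 123
```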
These functions help us draw many insights from textual data. Combined with Tableau’s calculated fields, they let users uncover meaning that is not obvious in the raw data. Some of the common insights that we can extract using the functions above are as follows:
- Finding lengths of certain fields may help us define a new dimension
- Comparing sorting orders of two fields can help define new agnostic dimensions
- Extracting substrings from textual data can give us valuable information
- Pattern matching can help us define trends in the data and give hidden meaning
- Well-defined regular expressions can help us correlate similar data points
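The ideas in the list above can be sketched in a few lines of Python. The ticket IDs and their "TEAM-YEAR-NUMBER" format are invented for illustration. In Tableau, the same logic would live in calculated fields built from LEN, SPLIT and the REGEXP functions.

```python
import re
from collections import Counter

# Hypothetical ticket IDs; the "<TEAM>-<YEAR>-<NUMBER>" format is assumed.
tickets = ["OPS-2021-0007", "DEV-2021-0113", "OPS-2022-0042", "misc note"]

PATTERN = re.compile(r"^([A-Z]+)-(\d{4})-(\d+)$")


def derive_dimensions(ticket: str) -> dict:
    """Derive new dimensions from one text value."""
    match = PATTERN.match(ticket)
    return {
        "length": len(ticket),               # length as a dimension
        "well_formed": match is not None,    # pattern matching as a flag
        "team": match.group(1) if match else "UNKNOWN",   # extracted substring
        "year": match.group(2) if match else "UNKNOWN",
    }


rows = [derive_dimensions(ticket) for ticket in tickets]

# Group similar records by the extracted team code.
team_counts = Counter(row["team"] for row in rows)
print(team_counts["OPS"], team_counts["DEV"], team_counts["UNKNOWN"])  # 2 1 1
```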
A big limitation with respect to textual data is the lack of validation or verification mechanisms within Tableau. The insights we get depend entirely on the data that is integrated with Tableau, and textual data often does not make complete sense on its own. Tableau does not have a built-in function to clean such data or interpret it into a more human-readable format. For example, machine-generated data may be just an alphanumeric code whose meaning only industry experts can decipher. For a person without that expertise to understand visualizations of such data, the codes must first be converted into an equivalent meaningful, human-readable form.
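One common way to handle such machine-generated codes is to translate them before the data reaches the visualization layer. The sketch below shows the idea in Python. The codes and labels are made up, and the lookup table is an assumption about how an expert's knowledge might be captured.

```python
# Hypothetical machine status codes and the labels an expert would assign.
CODE_LABELS = {
    "E01": "Overheating",
    "E02": "Low pressure",
    "W10": "Filter replacement due",
}


def describe(code: str) -> str:
    """Translate a raw code into a human-readable label."""
    key = code.strip().upper()  # tolerate stray spaces and lowercase input
    return CODE_LABELS.get(key, f"Unknown code ({key})")


for raw_code in [" e01 ", "W10", "X99"]:
    print(describe(raw_code))
# Overheating
# Filter replacement due
# Unknown code (X99)
```

Keeping unknown codes visible, rather than dropping them, makes gaps in the lookup table easy to spot in the resulting charts.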
This is one aspect that Tableau could develop in the near future. One domain it could leverage to build such a capability is Natural Language Processing (NLP). Some niche BI tools can already generate BI reports from search queries alone: the tool interprets the query and maps it to the appropriate dimensions and measures, then applies constraints based on the context of the user’s request and creates visualizations that serve the user’s needs. Tableau could use similar NLP techniques to detect the language of the input data and convert every record to a consistent standard, which would make the resulting visualizations more accurate and precise.
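Converting records to a consistent standard can be illustrated with a much simpler technique than full NLP: fuzzy string matching against a list of canonical values. The sketch below uses difflib from Python's standard library. The canonical list, the 0.8 similarity cutoff and the function name are assumptions chosen for this example, and it is not a description of any Tableau feature.

```python
import difflib
from typing import Optional

# Hypothetical canonical values that every record should map to.
CANONICAL = ["United States", "United Kingdom", "Germany"]
_LOOKUP = {name.lower(): name for name in CANONICAL}


def normalize(value: str, cutoff: float = 0.8) -> Optional[str]:
    """Map a free-text value to its closest canonical form, if close enough."""
    candidates = difflib.get_close_matches(
        value.strip().lower(), list(_LOOKUP), n=1, cutoff=cutoff
    )
    return _LOOKUP[candidates[0]] if candidates else None


print(normalize("Untied States"))   # United States
print(normalize("  germny"))        # Germany
print(normalize("Atlantis"))        # None
```

Values that fall below the cutoff are returned as None so that they can be reviewed by hand instead of being silently mapped to the wrong category.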