Analysis of Structural Properties of TFSI Anion
Tej Gumaste, Fynnegan L. Cooper, James D. Blakemore
University of Kansas
Project Overview
Chemical data science is an emerging area of chemistry with applications relevant to a wide range of academic and industrial pursuits. As quantum computing and artificial intelligence become more mainstream, both data scientists and chemists have a strong interest in using these technologies to streamline data processing, analyze large data sets, and reveal chemical insights that would otherwise be hidden by the sheer volume and complexity of the available data. In this project, modern computing and fundamental chemical analysis were combined to pursue new insights into the structural behavior of the weakly coordinating anion bis(trifluoromethylsulfonyl)imide, commonly known as TFSI. Taking a data science approach, published solid-state crystal structures in the Cambridge Structural Database (CSD) [1,2] that contain one or more TFSI species of interest were categorized and statistically analyzed using software built for this purpose. From a chemical perspective, the goal was to determine the structural characteristics of TFSI as inferred from the structural data. From a data science perspective, the goal was to develop a new Python program that parses Crystallographic Information Files (CIFs) obtained from the CSD [3] into statistically relevant information that can be compared across the individual structural data sets. This research aims to highlight the application of data science to structural chemistry, an area where such methods remain unfamiliar, and to outline the capabilities of data processing for future structural and chemical investigations.
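To make the CIF-to-statistics idea concrete, the following is a minimal sketch and not the project's actual software. It assumes that bond lengths are read from the standard `_geom_bond_` loop of a CIF and that S-N bonds are the quantity of interest. The helper names (`strip_su`, `element`, `read_bond_loop`) are hypothetical, and the sample values are invented placeholders rather than CSD data.

```python
import re
import statistics

# Invented placeholder data in CIF loop syntax (not taken from the CSD).
SAMPLE_CIF = """\
data_example
loop_
_geom_bond_atom_site_label_1
_geom_bond_atom_site_label_2
_geom_bond_distance
S1 N1 1.574(3)
S2 N1 1.580(3)
S1 O1 1.421(2)
S1 C1 1.832(4)
"""


def strip_su(value):
    """Convert a CIF number such as '1.574(3)' to 1.574 (drops the s.u.)."""
    return float(re.sub(r"\(\d+\)$", "", value))


def element(label):
    """Return the leading letters of an atom label, e.g. 'S1' -> 'S'."""
    match = re.match(r"[A-Za-z]+", label)
    return match.group(0) if match else ""


def read_bond_loop(cif_text):
    """Return (label_1, label_2, distance) tuples from the _geom_bond loop.

    Simplification: rows are split on whitespace, so quoted values that
    contain spaces are not handled.
    """
    lines = [line.strip() for line in cif_text.splitlines()]
    bonds = []
    i = 0
    while i < len(lines):
        if lines[i] != "loop_":
            i += 1
            continue
        i += 1
        # Collect the column tags that follow the loop_ keyword.
        tags = []
        while i < len(lines) and lines[i].startswith("_"):
            tags.append(lines[i])
            i += 1
        is_bond_loop = "_geom_bond_distance" in tags
        # Data rows run until a blank line, a comment, or the next keyword.
        stop = ("_", "#", "loop_", "data_")
        while i < len(lines) and lines[i] and not lines[i].startswith(stop):
            if is_bond_loop:
                row = dict(zip(tags, lines[i].split()))
                bonds.append((
                    row["_geom_bond_atom_site_label_1"],
                    row["_geom_bond_atom_site_label_2"],
                    strip_su(row["_geom_bond_distance"]),
                ))
            i += 1
    return bonds


bonds = read_bond_loop(SAMPLE_CIF)

# Keep only S-N bonds, regardless of the order in which the atoms are listed.
sn_lengths = [d for a, b, d in bonds if {element(a), element(b)} == {"S", "N"}]

mean_sn = statistics.mean(sn_lengths)
stdev_sn = statistics.stdev(sn_lengths)
print(f"S-N bonds: n={len(sn_lengths)}, mean={mean_sn:.3f}, stdev={stdev_sn:.4f}")
```

In a real workflow the same function would be applied to each CIF in turn and the per-structure values pooled before computing statistics. A dedicated CIF parsing library would likely be more robust than this hand-written loop reader.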
References:
1. C. R. Groom, I. J. Bruno, M. P. Lightfoot and S. C. Ward, Acta Cryst. B, 2016, 72, 171-179.
2. The analysis reported here was carried out using results from the 2024.2 CSD release, dated July 2024. This release of the CSD contains more than 1.3 million individual structures from which our TFSI-containing data were selected. More information about the CSD can be obtained from the Cambridge Crystallographic Data Centre.
3. Cambridge Crystallographic Data Centre website.