This notebook documents some of the formats currently handled by pyMez (the backend library) and the progress made toward integrating formats for data curation and analysis. The plan for promoting interoperability is to create transformations that link these formats. As of 10/2016, all formats are imported using
import pyMez
or
from pyMez import *
To prevent the loading of the full pyMez application programming interface (API), add the folder pyMez to the sys.path variable and use
from Code.DataHandlers.<*Models> import <Model Name>
# example
from Code.DataHandlers.TouchstoneModels import SNP
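The sys.path step described above might look like the following minimal sketch. The folder location `PYMEZ_FOLDER` is a hypothetical placeholder, not a path taken from pyMez, and the import is wrapped in a try block so the snippet still runs when pyMez is not at that location.

```python
import sys
from pathlib import Path

# Hypothetical location of the pyMez folder; replace with the real path on your machine.
PYMEZ_FOLDER = Path.home() / "pyMez"

# Add the folder to sys.path only once.
if str(PYMEZ_FOLDER) not in sys.path:
    sys.path.append(str(PYMEZ_FOLDER))

try:
    # Imports a single model without loading the full pyMez API.
    from Code.DataHandlers.TouchstoneModels import SNP
except ImportError:
    # pyMez was not found at the assumed location.
    SNP = None
```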
Currently all data handlers share the signature class(file_path=None, **options),
where options vary from class to class.
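As an illustration of that shared signature, here is a minimal sketch of a handler written in the same pattern. The class `ExampleHandler` and its option names (`delimiter`, `comment_prefix`) are hypothetical and are not part of pyMez; the real classes define their own options.

```python
class ExampleHandler:
    """Hypothetical handler following the class(file_path=None, **options) pattern."""

    # Assumed default options; real pyMez classes define their own names and values.
    DEFAULTS = {"delimiter": ",", "comment_prefix": "#"}

    def __init__(self, file_path=None, **options):
        # Start from the defaults and let keyword options override them.
        self.options = dict(self.DEFAULTS)
        self.options.update(options)
        self.path = file_path
        self.lines = []
        # With no file_path the object starts empty; otherwise the file is read.
        if file_path is not None:
            with open(file_path, "r") as handle:
                self.lines = handle.read().splitlines()


empty = ExampleHandler()                   # no file, default options
custom = ExampleHandler(delimiter="\t")    # no file, one option overridden
```

Keeping `file_path` optional lets the same class either open an existing file or build a new one from options alone.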
All of the classes that handle these formats can be found in pyMez.Code.DataHandlers.NISTModels.
These classes deal with file formats that the materials and on-wafer experiments developed in parallel. During development the classes are spread across pyMez.Code.DataHandlers.NISTModels, pyMez.Code.DataHandlers.TouchstoneModels, and pyMez.Code.DataHandlers.RadiCALModels, but they may later be combined. Currently there are three basic formats: the ASCII table format used to store data from the experiment, that data converted to s2p, and the output of the RadiCAL program.
StatistiCAL is a program that creates calibrations, and uncertainties for those calibrations, for two-port measurements. pyMez has both a COM (Component Object Model) wrapper for the program and a series of classes that handle the files StatistiCAL requires to run and the files it outputs.
The Microwave Uncertainty Framework is a program written by Dylan Williams to create Monte Carlo based uncertainties on VNA-based measurements. It is written in VB.NET and can be accessed at a base level using the package pythonnet. In addition, it has several data formats that can be separated by type. The first type consists of XML-based menu formats that populate the GUI used to manipulate them (.vnauncert and .meas, for example). The next type consists of ASCII formats with extensions that denote the type of information they hold (.eps, .iso, .switch). Finally, the remaining formats belong to the Touchstone family. Currently the DataHandlers are being written to match the logic and meaning of the file formats, but all the formats can be accessed through the base classes.
Touchstone is a series of formats for saving s-parameters and related data from network analyzer measurements. It is an ASCII-based format that can have several extensions associated with it. For a given number of ports it may have the file extension .snp, where n is the number of ports. The most common extension is .s2p, but other port numbers exist, and .ts can also denote a Touchstone file with an unknown number of ports. There are two versions of Touchstone files; however, version 1 is much more prevalent, and currently all support in pyMez is based on version 1. All models that handle Touchstone files can be found in pyMez.Code.DataHandlers.TouchstoneModels.
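The extension convention above can be illustrated with a short sketch that reads the port count from a file name. The helper `port_count_from_name` is a hypothetical name for this example and is not a pyMez function; it only inspects the file name and does not parse the file contents.

```python
import re

# Matches extensions of the form .snp, where n is one or more digits (e.g. .s2p, .s4p).
_SNP_PATTERN = re.compile(r"\.s(\d+)p$", re.IGNORECASE)


def port_count_from_name(file_name):
    """Return the number of ports implied by a .snp extension, or None if unknown."""
    match = _SNP_PATTERN.search(file_name)
    if match:
        return int(match.group(1))
    # Extensions such as .ts do not encode the number of ports.
    return None


two_port = port_count_from_name("filter.s2p")    # 2
four_port = port_count_from_name("device.S4P")   # 4
unknown = port_count_from_name("unknown.ts")     # None
```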
One of the primary ways to create a project file with a very portable GUI is to create an XML file and use XSLT to transform the data into an HTML page. This allows a lot of flexibility and integrates well with CALNET; as such, there are XML models that represent Logs, InstrumentStates, DataTables, and a whole host of other files following specific patterns. Currently the test website displays data interactively by reading it with the appropriate model, transforming it to a similar XML model, and then transforming that to HTML using an XSLT transform. All XMLModels are currently found in the module pyMez.Code.DataHandlers.XMLModels, and the XSL (style sheets that transform the XML) can be found in the folder pyMez/Code/DataHandlers/XSL.
One of the most important themes in both pyMez and CALNET is the creation and management of "projects", or collections of files with a description of that collection. Most programs do this implicitly, but it is our goal to make this explicit so that we can exchange data between users and programs effectively. There are several strategies for creating projects; the ones that pyMez will focus on are:
One of the most popular ways to store data is to create an ASCII file with a header followed by a set of columns and potentially a footer. This general data pattern, along with options for delimiters and other separators, is found in pyMez.Code.DataHandlers.GeneralModels. The primary class is AsciiDataTable, which gives the user the ability to save an ad hoc schema by pickling (Python-specific saving) the options. Most of the classes for s-parameter and power data are derived from this class. Touchstone models have a slightly different format (they can save the data for a single frequency in multiple rows or have a different type of data present), so they inherit from a different base class. The AsciiDataTable is fairly general: it can handle different data types, headers with different structures, and changing units, and it can save in different formats and retrieve and print different logical units. This class needs to be updated with more ways to save the schema and more robust error handling. In addition, an algorithm to guess at the format would be very useful. The rectangular portion of this object (the data attribute) can easily be converted to many different formats (Excel, CSV, MATLAB, HDF5); however, the header must have a specified structure to be parsed as anything other than text.
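To make the header-plus-columns pattern concrete, here is a minimal sketch that splits a small ASCII table into header lines and numeric rows, then round-trips the parsing options through pickle. It does not use AsciiDataTable; the function `parse_ascii_table`, the option names, and the sample text are all assumptions made for illustration.

```python
import pickle

# Hypothetical sample: two header lines marked with '#', then comma-delimited columns.
SAMPLE = "# Device: example\n# Frequency (GHz), Magnitude\n1.0, 0.5\n2.0, 0.25\n"


def parse_ascii_table(text, delimiter=",", comment_prefix="#"):
    """Split text into a list of header strings and a list of numeric rows."""
    header, data = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        if stripped.startswith(comment_prefix):
            header.append(stripped[len(comment_prefix):].strip())
        else:
            data.append([float(value) for value in stripped.split(delimiter)])
    return header, data


# The options act as an ad hoc schema that can be saved and restored with pickle.
schema = {"delimiter": ",", "comment_prefix": "#"}
restored_schema = pickle.loads(pickle.dumps(schema))

header, data = parse_ascii_table(SAMPLE, **restored_schema)
```

Here the header is kept as plain text, which mirrors the point above that a header needs a specified structure before it can be parsed as anything else.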
Django is the Python web framework of choice for CALNET and the Checkstandard database. It uses pyMez to analyze and track data. A Django model is a specific data model that is cast directly into an SQL-compliant database such as SQLite or MySQL. The models therefore shadow SQL column modelling and have attributes that are columns with the type specified by the class definition. Certain models are converted directly to these types and then stored inside the website's database. There is a module, pyMez.Code.DataHandlers.AbstractDjangoModules, that stores basic patterns for reuse. Currently UserFile is the most important of these.
Currently MATLAB uses HDF5 with the extension .mat to store data (V7.3 and greater). Variables saved by older versions of MATLAB can be accessed with scipy.io.loadmat() and scipy.io.savemat(), but it is my intention to support only HDF5-based files to reduce the workload and circumvent a bug in the Python 2.7 Anaconda distribution. The ability to translate to MATLAB variables will be added to promote sharing of data; in addition, the binary project model supports this type of information exchange.
The guiding principle for pyMez will be one of data transformations and not a single data format. The transformations will follow a network approach that treats formats holding the same content as network nodes and the transformations between them as edges. Any time content is changed, it can be thought of as an off-graph transformation (a jump). The basic data patterns that will have graphs defined are
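The network view described above can be sketched as a directed graph in which a breadth-first search finds a chain of transformations between two formats. The format names and edges in `TRANSFORMATIONS` are hypothetical placeholders chosen for illustration and are not the graphs defined in pyMez.

```python
from collections import deque

# Hypothetical graph: each key is a format (node) and each listed value is a
# format it can be transformed into directly (edge).
TRANSFORMATIONS = {
    "AsciiDataTable": ["XML", "CSV"],
    "XML": ["HTML", "AsciiDataTable"],
    "CSV": ["AsciiDataTable"],
    "HTML": [],
}


def find_path(graph, start, goal):
    """Return the shortest list of formats leading from start to goal, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no chain of transformations connects the two formats


csv_to_html = find_path(TRANSFORMATIONS, "CSV", "HTML")
html_to_csv = find_path(TRANSFORMATIONS, "HTML", "CSV")
```

In this sketch only the direct transformations need to be written; any indirect conversion is composed by walking the edges along the returned path.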
The future set of models that need to be supported: