pyMez Format Handling

This notebook documents some of the formats currently handled by pyMez (the backend library) and the progress toward integrating formats for data curation and analysis. The plan for promoting interoperability is to create transformations that link these formats. As of 10/2016, all formats are imported using

import pyMez

or

from pyMez import *

To avoid loading the full pyMez application programming interface (API), add the folder pyMez to the sys.path variable and use

from Code.DataHandlers.<*Models> import <Model Name>
# example
from Code.DataHandlers.TouchstoneModels import SNP
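The path setup described above might look like the following minimal sketch. The location /path/to/pyMez is a placeholder, and add_to_sys_path is a hypothetical helper written for this example, not part of pyMez.

```python
import os
import sys


def add_to_sys_path(folder):
    """Put `folder` at the front of sys.path once and return its absolute path."""
    folder = os.path.abspath(folder)
    if folder not in sys.path:
        sys.path.insert(0, folder)
    return folder


# Placeholder location: point this at the local pyMez folder.
pymez_folder = add_to_sys_path("/path/to/pyMez")

# With the folder on sys.path, a single handler can be imported directly:
# from Code.DataHandlers.TouchstoneModels import SNP
```

The import itself is left commented out because it only succeeds once the placeholder is replaced with a real pyMez folder.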

Currently all data handlers share the signature class(file_path=None, **options), where the options vary from class to class.
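To illustrate the shared signature, here is a small stand-in class that follows the same pattern. ExampleHandler and its option names (delimiter, comment) are assumptions made for this sketch and do not correspond to any actual pyMez class.

```python
class ExampleHandler(object):
    """Hypothetical handler that follows the class(file_path=None, **options) pattern."""

    def __init__(self, file_path=None, **options):
        # Class-specific defaults; keyword arguments override them.
        defaults = {"delimiter": ",", "comment": "#"}
        self.options = dict(defaults)
        self.options.update(options)
        self.path = file_path
        self.lines = []
        # Only read from disk when a path was supplied.
        if file_path is not None:
            with open(file_path, "r") as in_file:
                self.lines = in_file.read().splitlines()


empty = ExampleHandler()                 # no file: start from an empty model
custom = ExampleHandler(delimiter="\t")  # an option overrides the class default
```

Making file_path optional lets the same class either open an existing file or build a new model from the options alone.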

Future Formats


Historical Sparameter Power

All of the classes to handle these formats can be found at pyMez.Code.DataHandlers.NISTModels

Raw Ascii Formats

pyMez:class:OnePortRawModel
pyMez:class:TwoPortRawModel
pyMez:class:TwoPortNRRawModel
pyMez:class:PowerRawModel
Conversion to a CSV table and inclusion of SAS database

Sparameter_Power_Data_Transformation_20160502_001.html

Data Formats Processed With Calrep

pyMez:class:OnePortCalrepModel
pyMez:class:OnePortDUTModel
pyMez:class:TwoPortCalrepModel
pyMez:class:PowerCalrepModel
Conversion to a CSV table and inclusion of SAS database

Sparameter_Power_Data_Transformation_20160502_001.html

Other formats

.res files



Materials and On-Wafer

These classes handle file formats that the materials and on-wafer experiments developed in parallel. During development the classes are spread across pyMez.Code.DataHandlers.NISTModels, pyMez.Code.DataHandlers.TouchstoneModels and pyMez.Code.DataHandlers.RadiCALModels, but they may later be combined. Currently there are 3 basic formats: the ascii table format used to store data from the experiment, that data converted into s2p, and the output from the radical program.

Raw Ascii Formatted Data
Radical Data
s2p see Touchstone


StatistiCAL

StatistiCAL is a program that creates calibrations, and uncertainties for those calibrations, for two-port measurements. pyMez has both a COM (Component Object Model) wrapper for the program and a series of classes for the files that StatistiCAL requires as input and produces as output.

The StatistiCAL wrapper
pyMez:class:StatistiCALMenuModel
pyMez:class:TwelveTermErrorModel
pyMez:class:StatistiCALSolutionModel
The four port error adapter can be opened and used for correction using pyMez:class:SNP by accessing the attribute sparameter_complex; see Touchstone


MUF

The Microwave Uncertainty Framework (MUF) is a program written by Dylan Williams to create Monte Carlo based uncertainties on VNA based measurements. It is written in VB.NET and can be accessed at a base level using the package pythonnet. In addition, it has several data formats that can be separated by type. The first type is XML based menu formats that populate the GUI used to manipulate them (.vnauncert and .meas, for example). The next type is ascii formats with extensions that denote the type of information they hold (.eps, .iso, .switch). The final type belongs to the touchstone family (see Touchstone). Currently the DataHandlers are being written to match the logic and meaning of the file formats, but all the formats can be accessed through the base classes.

XML Based Menus
Ascii Based Formats
SNP file types (s2p,s4p)


Touchstone

Touchstone is a series of formats for saving s-parameters and related data from network analyzer measurements. It is an ascii based format that can have several extensions associated with it. For a given number of ports the file extension is .snp, where n is the number of ports. The most common extension is .s2p, but other port numbers exist, and .ts can also denote a touchstone file with an unknown number of ports. There are 2 versions of touchstone files; however, version 1 is much more prevalent and currently all support in pyMez is based on version 1. All models that handle touchstone files can be found in pyMez.Code.DataHandlers.TouchstoneModels

General SNP files (ports 1-100)
The special case S2P
The special case S1PV1
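As a small illustration of the .snp naming convention, the port count can be read from the extension. The function number_of_ports is a hypothetical helper for this sketch and is not taken from pyMez.Code.DataHandlers.TouchstoneModels.

```python
import re

# Matches .s<digits>p at the end of a name, for example .s1p, .s2p or .s12p.
_SNP_EXTENSION = re.compile(r"\.s(\d+)p$", re.IGNORECASE)


def number_of_ports(file_name):
    """Return the port count encoded in a .snp extension, or None if it is not encoded."""
    match = _SNP_EXTENSION.search(file_name)
    if match is None:
        return None
    return int(match.group(1))


print(number_of_ports("filter.s2p"))    # 2
print(number_of_ports("Splitter.S4P"))  # 4
print(number_of_ports("unknown.ts"))    # None
```

A .ts file returns None here because its extension does not carry the port count; that has to be determined from the file contents instead.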


XML Models

One of the primary ways to create a project file with a very portable GUI is to create an XML file and use XSLT to transform the data into an HTML page. This allows a lot of flexibility and integrates well with CALNET, so there are XML models that represent Logs, InstrumentStates, DataTables, and a whole host of other files following specific patterns. Currently the test website displays data interactively by reading it with the appropriate model, transforming it to a similar XML model, and then transforming that to HTML with an XSLT transform. All XMLModels are currently found in the module pyMez.Code.DataHandlers.XMLModels, and the XSL (style sheets that transform the XML) can be found in the folder pyMez/Code/DataHandlers/XSL

pyMez:class:XMLBase

pyMez:class:XMLLog

pyMez:class:DataTable

pyMez:class:FileRegister

pyMez:class:Metadata

pyMez:class:InstrumentSheet

pyMez:class:InstrumentState
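The XML-plus-style-sheet pattern can be sketched with the standard library alone. The element names, the attribute, and the style sheet file name below are invented for illustration and are not the schema used by the pyMez XML models. The sketch only attaches a style sheet reference; it does not run the XSLT transform, which would require an XSLT processor.

```python
from xml.dom import minidom

# Build a small log document; element and attribute names are illustrative only.
document = minidom.Document()
root = document.createElement("Log")
document.appendChild(root)

entry = document.createElement("Entry")
entry.setAttribute("Index", "1")
entry.appendChild(document.createTextNode("Measurement started"))
root.appendChild(entry)

# Ask XSLT-aware viewers (for example a web browser) to render with a style sheet.
# The href is a placeholder file name.
stylesheet = document.createProcessingInstruction(
    "xml-stylesheet", 'type="text/xsl" href="example_log_style.xsl"'
)
document.insertBefore(stylesheet, root)

xml_text = document.toxml()
print(xml_text)
```

Keeping the data in XML and the presentation in a separate XSL file means the same document can be rendered in different ways without changing what is stored.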


Project Models

One of the most important themes in both pyMez and CALNET is the creation and management of "projects", or collections of files with a description of that collection. Most programs do this implicitly, but it is our goal to make this explicit so that we can exchange data between users and programs effectively. There are several strategies for creating projects; the ones that pyMez will focus on are:

Arbitrary Database Based Projects

ZIP based projects

XML Based Projects

Binary Projects
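As one illustration of making a project explicit, the sketch below bundles files and a description of them into a single ZIP archive. The manifest file name, its JSON layout, and the function build_zip_project are assumptions made for this example; they do not describe the pyMez project format.

```python
import io
import json
import zipfile


def build_zip_project(description, files):
    """Return the bytes of a ZIP archive holding `files` plus a manifest describing them.

    `files` maps archive names to text content. The manifest name and layout
    are assumptions made for this sketch.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in sorted(files.items()):
            archive.writestr(name, content)
        manifest = {"description": description, "files": sorted(files)}
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buffer.getvalue()


project_bytes = build_zip_project(
    "Two-port check standard, raw and converted data",
    {
        "raw/measurement.txt": "1.0,0.5,0.1\n",
        "converted/measurement.s2p": "# GHz S RI R 50\n",
    },
)

# Reading the archive back recovers both the files and their description.
with zipfile.ZipFile(io.BytesIO(project_bytes)) as archive:
    print(archive.namelist())
    print(json.loads(archive.read("manifest.json").decode("utf-8"))["description"])
```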


General Ascii

One of the most popular ways to store data is to create an Ascii file with a header followed by a set of columns and potentially a footer. This general data pattern, along with options for delimiters and other separators, is found in pyMez.Code.DataHandlers.GeneralModels. The primary class is pyMez:class:AsciiDataTable, which gives the user the ability to save an ad-hoc schema by pickling (python specific saving) the options. Most of the classes for sparameter/power are derived from this class. Touchstone models have a slightly different format (they can save the data for a single frequency in multiple rows or have a different type of data present), so they inherit from a different base class. The AsciiDataTable is fairly general and can handle different data types, headers with different structures and changing units, along with saving in different formats and retrieving and printing different logical units. This class needs to be updated with more ways to save the schema and more robust error handling. In addition, an algorithm to guess at the format would be very useful. The rectangular portion of this object (the data attribute) can easily be converted to many different formats (Excel, csv, matlab, HDF5); however, the header must have a structure specified to be parsed as anything other than text.
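The header, columns and footer pattern, together with a pickled ad-hoc schema, can be sketched as follows. The function parse_table and the option names in schema are assumptions chosen for this example; they are not the options accepted by AsciiDataTable.

```python
import pickle

text = """# Device: example attenuator
# Date: 2016-10-01
Frequency,Magnitude,Phase
1.0,0.50,10.0
2.0,0.48,12.5
END"""

# Ad-hoc schema: these option names are illustrative, not the AsciiDataTable options.
schema = {"comment_begin": "#", "data_delimiter": ",", "footer_lines": 1}


def parse_table(text, comment_begin, data_delimiter, footer_lines):
    """Split text into header lines, column names, rows of floats, and footer lines."""
    lines = text.splitlines()
    # The header is the run of comment lines at the top of the file.
    header_length = 0
    while header_length < len(lines) and lines[header_length].startswith(comment_begin):
        header_length += 1
    header = lines[:header_length]
    body = lines[header_length:]
    # The footer is the last `footer_lines` lines of what remains.
    split_at = len(body) - footer_lines
    footer = body[split_at:]
    body = body[:split_at]
    column_names = body[0].split(data_delimiter)
    data = [[float(value) for value in row.split(data_delimiter)] for row in body[1:]]
    return header, column_names, data, footer


# Pickle the schema so the same options can be reloaded later.
saved_schema = pickle.dumps(schema)
header, column_names, data, footer = parse_table(text, **pickle.loads(saved_schema))
print(column_names)
print(data)
```

Here the header is kept as plain text lines, which mirrors the point above: without a declared structure, only the rectangular data portion converts cleanly to other formats.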


Django Models

Django is the python web framework of choice for CALNET and the Checkstandard database. It uses pyMez to analyze and track data. A django model is a specific data model that is directly cast into an SQL compliant database such as SQLite or MySQL. The models therefore shadow SQL column modelling and have attributes that are columns with the type specified by the class definition. Certain models are converted directly to these types and then stored inside the website's database. There is a module pyMez.Code.DataHandlers.AbstractDjangoModules that stores basic patterns for reuse. Currently UserFile is the most important of these.


Matlab Models

Currently matlab uses HDF5 with the extension .mat to store data (V7.3 and greater). Variables saved by older versions of matlab can be accessed with scipy.io.loadmat() and scipy.io.savemat(), but it is my intention to support only HDF5 based files to reduce the work load and circumvent a bug in the python 2.7 Anaconda distribution. The ability to translate to matlab variables will be added to promote sharing of data; in addition, the binary project model supports this type of information exchange.


Future

The guiding principle for pyMez will be one of data transformations and not a single data format. The transformations will follow a network approach in which formats that hold the same content are the nodes and the transformations between them are the edges. Any time content is changed, it can be thought of as an off-graph transformation (a jump). The basic data patterns that will have graphs defined are

string (already defined)
rectangular data table
data table
project
see Project Models for a better description
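The network approach can be sketched as a small graph in which each format is a node and each edge stores a conversion function. The formats, converters and function names below are toy stand-ins invented for this example; they are not the graphs planned for pyMez.

```python
from collections import deque


# Each edge is a function that converts one format (node) into another.
def csv_to_rows(text):
    return [line.split(",") for line in text.splitlines()]


def rows_to_records(rows):
    names, body = rows[0], rows[1:]
    return [dict(zip(names, row)) for row in body]


def rows_to_csv(rows):
    return "\n".join(",".join(row) for row in rows)


EDGES = {
    "csv_string": {"row_list": csv_to_rows},
    "row_list": {"record_list": rows_to_records, "csv_string": rows_to_csv},
    "record_list": {},
}


def find_path(start, goal):
    """Breadth-first search for the shortest chain of formats from start to goal."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in EDGES.get(path[-1], {}):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


def transform(content, start, goal):
    """Apply each edge along the path to move content between formats."""
    path = find_path(start, goal)
    if path is None:
        raise ValueError("no transformation path from %s to %s" % (start, goal))
    for source, target in zip(path, path[1:]):
        content = EDGES[source][target](content)
    return content


records = transform("f,mag\n1,0.5\n2,0.4", "csv_string", "record_list")
print(records)
```

With this layout, supporting a new format only requires adding a node and at least one edge to it; every format already reachable in the graph can then be converted to the new one by chaining existing edges.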

The future set of models that need to be supported:

JSON
Images
DOM