Room: Room 228
April 4
11:00–11:25
The amount of data and the complexity of the queries are not particularly large in this industry. The challenge comes from the STDF format, a binary file format with roots in the 1980s.
A method for making this data source available to modern data analysis tools (Jupyter/Streamlit) using the construct library will be discussed. The focus is on how the data can be collected, converted, and made available quickly and efficiently, using both PyPy and CPython.
Basic Python skills. An understanding of how binary file formats can be handled is helpful.
In the silicon production industry, production test results are stored in STDF, a file format that is on its way out but still widely used.
This ageing file format is well established and protected by strong institutional momentum, although TEMS (which covers messages rather than a file format) and RITdb are contending to replace it.
This presentation shows how to leverage the power of construct (https://github.com/construct/construct) together with PyPy and CPython to transform STDF data to Parquet, making it accessible to modern and efficient analysis methods (polars/pandas dataframes).
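As a rough illustration of the first step of such a conversion, the sketch below walks the record stream of an STDF file. It assumes the STDF V4 record header (a 2-byte length, a 1-byte record type, and a 1-byte record subtype) and little-endian byte order. The function name iter_records and the sample bytes are hypothetical, and the standard library's struct module is used here instead of construct so that only the framing is shown.

```python
import io
import struct

# Record header: REC_LEN (U*2), REC_TYP (U*1), REC_SUB (U*1).
# Little-endian byte order is an assumption made for this sketch.
HEADER = struct.Struct("<HBB")


def iter_records(stream):
    """Yield (rec_typ, rec_sub, payload) for each record in a binary stream."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return  # clean end of file
        if len(header) < HEADER.size:
            raise ValueError("truncated record header")
        rec_len, rec_typ, rec_sub = HEADER.unpack(header)
        payload = stream.read(rec_len)
        if len(payload) < rec_len:
            raise ValueError("truncated record payload")
        yield rec_typ, rec_sub, payload


# Hypothetical sample data: a FAR record (0, 10) followed by a PGR record (1, 62).
far = HEADER.pack(2, 0, 10) + bytes([2, 4])
pgr_payload = struct.pack("<H", 1) + bytes([4]) + b"ADDR" + struct.pack("<H", 0)
pgr = HEADER.pack(len(pgr_payload), 1, 62) + pgr_payload

# One flat row per record, a shape that dataframe libraries accept directly.
rows = [
    {"rec_typ": rec_typ, "rec_sub": rec_sub, "size": len(payload)}
    for rec_typ, rec_sub, payload in iter_records(io.BytesIO(far + pgr))
]
```

A list of flat rows like this can be handed to a polars or pandas dataframe and written out as Parquet from there; in a real conversion the payload of each record would be decoded into named columns rather than reduced to its size.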
Using construct, the STDF file format specification can be turned into an implementation largely by copy/paste and search/replace, and that implementation can both parse and generate STDF files.
This is what the implementation of one segment looks like:
import construct

PGR_payload = construct.Struct(
    "GRP_INDX" / construct.Int16ul * "Unique index associated with pin group",
    "GRP_NAM" / construct.PascalString(construct.Byte, "ascii") * "Name of pin group length byte = 0",
    # ... remaining fields of the record are not shown in this segment
)
This implementation allows us to easily create:
With the PyPy just-in-time compiler, parsing is fast enough that the bottleneck in our environment is network throughput.
Old binary file formats which are specified by tables are easily accessible to modern methods using construct.