diff options
Diffstat (limited to 'lib/tabula/README.html')
| -rw-r--r-- | lib/tabula/README.html | 177 |
1 files changed, 0 insertions, 177 deletions
diff --git a/lib/tabula/README.html b/lib/tabula/README.html deleted file mode 100644 index ca10e819..00000000 --- a/lib/tabula/README.html +++ /dev/null @@ -1,177 +0,0 @@ -<!DOCTYPE html> -<html> - <head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> - <meta name="viewport" content="width=device-width, initial-scale=1" /> - <meta name="theme-color" content="#375EAB" /> - - <title></title> - </head> - <body> - <div class="topbar"> - <div class="container"> - <div class="top-heading"> - <a href="/">github.com/shuLhan/share</a> - </div> - <div class="menu"> - <a href="https://godoc.org/github.com/shuLhan/share">GoDoc</a> - </div> - <div class="menu"> - <a href="/CHANGELOG.html">Changelog</a> - </div> - </div> - </div> - - <div class="page"> - <div class="container"> - <h1></h1> - <p><a href="https://godoc.org/github.com/shuLhan/share/lib/tabula"><img src="https://godoc.org/github.com/shuLhan/share/lib/tabula?status.svg" alt="GoDoc"></a> -<a href="https://goreportcard.com/report/github.com/shuLhan/share/lib/tabula"><img src="https://goreportcard.com/badge/github.com/shuLhan/share/lib/tabula" alt="Go Report Card"></a> -<img src="https://cover.run/go/github.com/shuLhan/share/lib/tabula.svg" alt="cover.run go"></p> -<p>Package tabula is a Go library for working with rows, columns, or matrix -(table), or in another terms working with data set.</p> -<h1>Overview</h1> -<p>Go's slice gave a flexible way to manage sequence of data in one type, but what -if you want to manage a sequence of value but with different type of data? -Or manage a bunch of values like a table?</p> -<p>You can use this library to manage sequence of value with different type -and manage data in two dimensional tuple.</p> -<h2>Terminology</h2> -<p>Here are some terminologies that we used in developing this library, which may -help reader understand the internal and API.</p> -<p>Record is a single cell in row or column, or the smallest building block of -dataset.</p> -<p>Row is a horizontal representation of records in dataset.</p> -<p>Column is a vertical representation of records in dataset. -Each column has a unique name and has the same type data.</p> -<p>Dataset is a collection of rows and columns.</p> -<p>Given those definitions we can draw the representation of rows, columns, or -matrix:</p> -<pre><code> COL-0 COL-1 ... COL-x -ROW-0: record record ... record -ROW-1: record record ... record -... -ROW-y: record record ... record -</code></pre> -<h2>What make this package different from other dataset packages?</h2> -<h3>Record Type</h3> -<p>There are only three valid type in record: int64, float64, and string.</p> -<p>Each record is a pointer to interface value. Which means,</p> -<ul> -<li>Switching between rows to columns mode, or vice versa, is only a matter of -pointer switching, no memory relocations.</li> -<li>When using matrix mode, additional memory is used only to allocate slice, the -record in each rows and columns is shared.</li> -</ul> -<h3>Dataset Mode</h3> -<p>Tabula has three mode for dataset: rows, columns, or matrix.</p> -<p>For example, given a table of data,</p> -<pre><code>col1,col2,col3 -a,b,c -1,2,3 -</code></pre> -<ul> -<li> -<p>When in "rows" mode, each line is saved in its own slice, resulting in Rows:</p> -<pre><code>Rows[0]: [a b c] -Rows[1]: [1 2 3] -</code></pre> -<p>Columns is used only to save record metadata: column name, type, flag and -value space.</p> -</li> -<li> -<p>When in "columns" mode, each line saved in columns, resulting in Columns:</p> -<pre><code>Columns[0]: {col1 0 0 [] [a 1]} -Columns[1]: {col2 0 0 [] [b 2]} -Columns[1]: {col3 0 0 [] [c 3]} -</code></pre> -<p>Each column will contain metadata including column name, type, flag, and -value space (all possible value that <em>may</em> contain in column value).</p> -<p>Rows in "columns" mode is empty.</p> -</li> -<li> -<p>When in "matrix" mode, each record is saved both in row and column using -shared pointer to record.</p> -<p>Matrix mode consume more memory by allocating two slice in rows and columns, -but give flexible way to manage records.</p> -</li> -</ul> -<h2>Features</h2> -<ul> -<li> -<p><strong>Switching between rows and columns mode</strong>.</p> -</li> -<li> -<p><a href="https://godoc.org/github.com/shuLhan/share/lib/tabula#RandomPickRows"><strong>Random pick rows with or without replacement</strong></a>.</p> -</li> -<li> -<p><a href="https://godoc.org/github.com/shuLhan/share/lib/tabula#RandomPickColumns"><strong>Random pick columns with or without replacement</strong></a>.</p> -</li> -<li> -<p><a href="https://godoc.org/github.com/shuLhan/share/lib/tabula#SelectColumnsByIdx"><strong>Select column from dataset by index</strong></a>.</p> -</li> -<li> -<p><a href="https://godoc.org/github.com/shuLhan/share/lib/tabula#SortColumnsByIndex"><strong>Sort columns by index</strong></a>, -or indirect sort.</p> -</li> -<li> -<p><a href="https://godoc.org/github.com/shuLhan/share/lib/tabula#SplitRowsByNumeric"><strong>Split rows value by numeric</strong></a>. -For example, given two numeric rows,</p> -<pre><code>A: {1,2,3,4} -B: {5,6,7,8} -</code></pre> -<p>if we split row by value 7, the data will splitted into left set</p> -<pre><code>A': {1,2} -B': {5,6} -</code></pre> -<p>and the right set would be</p> -<pre><code>A'': {3,4} -B'': {7,8} -</code></pre> -</li> -<li> -<p><a href="https://godoc.org/github.com/shuLhan/share/lib/tabula#SplitRowsByCategorical"><strong>Split rows by string</strong></a>. -For example, given two rows,</p> -<pre><code>X: [A,B,A,B,C,D,C,D] -Y: [1,2,3,4,5,6,7,8] -</code></pre> -<p>if we split the rows with value set <code>[A,C]</code>, the data will splitted into left -set which contain all rows that have A or C,</p> -<pre><code> X': [A,A,C,C] - Y': [1,3,5,7] -</code></pre> -<p>and the right set, excluded set, will contain all rows which is not A or C,</p> -<pre><code> X'': [B,B,D,D] - Y'': [2,4,6,8] -</code></pre> -</li> -<li> -<p><a href="https://godoc.org/github.com/shuLhan/share/lib/tabula#SelectRowsWhere"><strong>Select row where</strong></a>. -Select row at column index x where their value is equal to y (an analogy to -<em>select where</em> in SQL). -For example, given a rows of dataset,</p> -<pre><code>ROW-1: {1,A} -ROW-2: {2,B} -ROW-3: {3,A} -ROW-4: {4,C} -</code></pre> -<p>we can select row where the second column contain 'A', which result in,</p> -<pre><code>ROW-1: {1,A} -ROW-3: {3,A} -</code></pre> -</li> -</ul> - - </div> - - </div> - - - <div class="footer"> - Copyright 2019, Shulhan <ms@kilabit.info>. All rights reserved. - <br /> - Use of this source code is governed by a BSD-style license that can be - found in the <a href="/LICENSE">LICENSE</a> file. - </div> - </body> -</html> |
