GitHub - bridget-smart/LCSFinder: A toolkit for quickly calculating longest common substrings with specific relevance to entropy estimation.

LCSFinder

A package for quickly calculating longest common substrings in which the starting location of one substring is fixed. Given two strings $s_1$ (of length $n$) and $s_2$, the package finds the length of the longest substring of $s_2[0..j)$ that matches a prefix of $s_1[i..n)$. The match must begin at index $i$ in $s_1$ and must end before index $j$ in $s_2$. The index pairs $(i,j)$ are passed as a list of tuples with increasing $i$ and $j$, so many matches can be computed in a single call.
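To make the definition concrete, here is a brute-force reference sketch in plain Python. It is not the package's algorithm, and the function name `longest_match_length` is hypothetical. It assumes the reading that the whole match must lie inside $s_2[0..j)$.

```python
def longest_match_length(s1, s2, i, j):
    """Length of the longest prefix of s1[i:] that occurs entirely inside s2[:j].

    Brute force for illustration only: tries every start position k in s2[:j].
    """
    window = s2[:j]
    best = 0
    for k in range(len(window)):
        length = 0
        # extend the match while both sequences agree and stay in bounds
        while (i + length < len(s1)
               and k + length < len(window)
               and s1[i + length] == window[k + length]):
            length += 1
        best = max(best, length)
    return best


s1 = [1, 2, 3, 4]
s2 = [9, 1, 2, 3, 9]
print(longest_match_length(s1, s2, 0, 5))  # match 1,2,3 found at s2[1:4]
print(longest_match_length(s1, s2, 0, 3))  # window s2[:3] only allows 1,2
```

A checker like this can be handy for validating results on small inputs, since its cost grows quickly with sequence length.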

The algorithm uses properties of a sorted suffix array so that each longest match length can be found in O(1) time after O(N) precomputation.
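One suffix array property that this kind of approach can build on: in sorted order, the suffix sharing the longest common prefix with a given suffix is always one of its two neighbours. The sketch below illustrates only that property on a single sequence. It uses a naive construction that is far slower than O(N), it ignores the $(i,j)$ constraints, and the function names are hypothetical rather than part of LCSFinder.

```python
def suffix_array(seq):
    """Start indices of all suffixes of seq, sorted lexicographically (naive)."""
    return sorted(range(len(seq)), key=lambda k: seq[k:])


def common_prefix_len(a, b):
    """Number of leading positions where a and b agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def longest_shared_prefix(seq, i):
    """Longest prefix of seq[i:] shared with any other suffix of seq.

    Only the two neighbours of suffix i in sorted order need checking.
    """
    sa = suffix_array(seq)
    rank = sa.index(i)
    best = 0
    for neighbour in (rank - 1, rank + 1):
        if 0 <= neighbour < len(sa):
            best = max(best, common_prefix_len(seq[i:], seq[sa[neighbour]:]))
    return best


print(suffix_array("banana"))              # sorted suffix start positions
print(longest_shared_prefix("banana", 1))  # "anana" shares "ana" with suffix 3
```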

This package is designed to be used within a modified Kontoyiannis Shannon entropy estimator to improve computational speed. The estimator itself is currently provided in the ProcessEntropy package.
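For orientation, the sketch below shows how match lengths feed into the widely cited form of the Kontoyiannis estimator, $\hat{h} = N \log_2 N / \sum_i \Lambda_i$, where $\Lambda_i$ is one more than the longest match length at position $i$. This is an assumption about the general shape only: the modified estimator in ProcessEntropy may differ, and the function name `entropy_estimate` is hypothetical.

```python
import math


def entropy_estimate(match_lengths):
    """Unmodified Kontoyiannis-style estimate in bits per symbol.

    Assumes Lambda_i = (longest match length at position i) + 1.
    """
    n = len(match_lengths)
    total_lambda = sum(length + 1 for length in match_lengths)
    return n * math.log2(n) / total_lambda


# hypothetical match lengths, as a longest-match routine might return them
print(entropy_estimate([0, 1, 2, 1]))
```

Longer matches mean more repetition in the sequence, so the estimate falls as the match lengths grow.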

Example Usage

# load packages
import LCSFinder as lcs
import numpy as np

# initialise strings
list_source = np.random.randint(1,10,100)
list_target = np.random.randint(1,10,100)

# set up objects (convert NumPy integers to plain Python ints)
source = lcs.Vector1D([int(x) for x in list_source])
target = lcs.Vector1D([int(x) for x in list_target])
ob = lcs.LCSFinder(source,target)

# set up indices to search from: (i, j) pairs with increasing i and j
l_t =  lcs.Vector2D(tuple((i,i+1) for i in range(len(list_source))))

# compute the match for every (i, j) pair
ob.ComputeAllLCSs(l_t)

Requirements

A C++ compiler supporting C++11 or later
Python 3.x

Installation

pip install LCSFinder
