|
Text Analysis Info - Information retrieval software |
|
|
Last update: 30 June 2005
Programs listed here can be divided into more subtle groups:
|
AntConc 3.0.1 |
|
|
program: AntConc 3.0.1
author: Laurence Anthony
distributor: Linguist's Software
documentation: readme file for usage
download: free version
operating system: MS-Windows, Mac-OS
description:
|
AnyText |
|
|
program: AnyText
author: Linguist's Software
distributor: Linguist's Software
documentation: no
download: no
operating system: Mac-OS System 7.1-9.2, or the Classic system in OS X. (You
must be able to boot into Classic to install.) 2 MB of RAM.
description: AnyText is a HyperCard®-based full proximity Boolean search
engine and index generator. It creates concordances and runs fast word
searches on ordinary text files in English, Greek, and Russian. AnyText was
designed especially to work with the Greek, English, Cyrillic, and Latin Bible
texts, but can be used with any text-only file. The text files can be on
diskettes, hard disk drives, or CD-ROM drives, as long as there is disk space
for the special index files that AnyText must create and access during
operation.
|
Ask Sam 6.0 |
|
|
program: Ask Sam 6.0
author: Ask Sam Software Development
distributor: Ask Sam Software Development
documentation: overview and quick tour
download: trial version
operating system: MS-Windows
description: AskSam is a fast information retrieval program that can also
search e-mail and PDF files. The new professional version is programmable
(e.g. with Visual Basic).
|
ATA - Ashton Text Analyser (WinATA Mark 2) |
|
|
program: ATA - Ashton Text Analyser
author and distributor: Peter Roe
documentation: user's guide
download: no, but it is free for non-commercial applications
operating system(s): Win9x, WinNT
description: ATA generates word lists and KWIC (keyword in context) and KWOC (keyword out of context) listings.
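For readers unfamiliar with the term, a KWIC listing shows every occurrence of a search word together with a window of surrounding words. The Python sketch below illustrates the general idea only; it is not how ATA itself works, and the function name `kwic` and the `width` parameter are hypothetical.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for every hit."""
    # Lower-case the text and split it into word tokens.
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    hits = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits

if __name__ == "__main__":
    sample = "The cat sat on the mat. The dog saw the cat."
    for left, word, right in kwic(sample, "cat", width=2):
        # Right-align the left context so the keywords line up in one column.
        print(f"{left:>20}  {word}  {right}")
```

A KWOC listing would use the same hits but print the keyword as a heading outside its context line.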
|
Concordance 3.2 |
|
|
program: Concordance 3.2
author and distributor: Rob J C Watt
documentation: manual
download: trial
operating system(s): Win9x, WinNT, WinXP
description: Features include:
phrases, proximity search, samples, regular expression search, references
book-like indexing
options to treat upper and lower case separately, show duplicate words
separately, and analyse characters instead of words
sort headwords by order of occurrence, sort word endings using a string sort,
sort contexts by the string before and the string after the headword
language support including East Asian languages (e.g. Chinese) on Windows 2000/XP
user-definable HTML entity translation
|
DBT 3.1 - Data base testuale |
|
|
The website provides information in Italian only; English pages are under construction.
program: DBT 3.1 - overview under construction
author: Eugenio Picchi
distributor: none
documentation: none
download: demo with different texts
operating system: Win9x, WinNT
description: DBT can do word searches, concordances, and searches for word
sets with Boolean logic (including wildcards and fuzzy search), either in the
main text or in accessory components (notes, apparatus, appendices).
Also possible: word lists in different sort orders, index locorum,
list of all verses, rhyming dictionary, list of the most frequent character
sequences and word sequences, and handling of images, which can be associated
with every part of the text.
|
Eric Johnson's programs |
|
|
program: Eric Johnson's programs
author and distributor: Eric Johnson
documentation: none
download: none
operating system(s): DOS
description: Eric Johnson's programs are written especially for the analysis of
plays and poetry. Some of the programs require SGML-tagged texts; some are
limited to certain text corpora (e.g. Jane Austen or Shakespeare).
ACTORS: list of characters on the stage simultaneously (generated each time
there is an entrance or exit), co-occurrences of characters on stage, list of
possible doubling of roles.
BITZER generates an index of page numbers (or line numbers) for all words.
number of words within quotation marks
FINDLIST: comparison of word lists (more than two)
IDENT compares the number and percentage of occurrence of selected words in two
text files.
PICKWICK: filter program for scenes or places of a play in tagged texts.
SENT: statistics and graphics on sentence length (in strings).
SHAKWORD: cross-references for selected words in tagged texts.
WORDS: wordlists, counts types and tokens
Four programs that process the Oxford Electronic Text
Library Edition of the Complete Works of Jane Austen.
|
Kura 1.0 |
|
|
program: Kura 1.0
author: Boudewijn
Rempt
distributor: Boudewijn
Rempt
documentation: manual
download: open source
operating system(s): Win9x, WinNT, Unix/X11, MacOS, Linux
description:
The application consists of three independent parts: an application framework
that works with the data from the database, a GUI framework that lets the user
work with the data, and a server that can present the data in HTML form over
the Internet.
|
LEXA 7.0 - Corpus Processing Software |
|
|
program: LEXA 7.0
author: Raymond Hickey, University of Essen/Germany
distributor: University of Bergen, Norway
documentation: documentation that reads much like a manual
download: test
operating system(s): DOS
description: LEXA is an open system based on files. It can perform
lemmatisation, word lists, lexical density tables, file comparison, global
find and replace, database and corpus management functions (print, sort),
statistics on characters, words, and sentences, and string searches across
groups of files, with the wildcards * and ? and also in databases (DBF files).
There are also many DOS utilities.
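As an illustration of the * and ? wildcards mentioned above (* for any run of characters, ? for exactly one character), here is a small Python sketch. It uses the standard `fnmatch` module rather than anything from LEXA; the function name `wildcard_search` is hypothetical, and note that `fnmatch` additionally treats `[...]` as a character set.

```python
from fnmatch import fnmatchcase

def wildcard_search(words, pattern):
    """Return the words that match a * / ? wildcard pattern, ignoring case."""
    pattern = pattern.lower()
    return [w for w in words if fnmatchcase(w.lower(), pattern)]

if __name__ == "__main__":
    vocabulary = ["text", "Test", "texts", "tent", "ten"]
    print(wildcard_search(vocabulary, "te?t"))  # four-letter words: te + any letter + t
    print(wildcard_search(vocabulary, "te*"))   # everything starting with "te"
```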
|
Metamorph |
|
|
program: Metamorph
distributor: Thunderstone Software
documentation: manual
download: none
operating systems: DOS, Win9x, WinNT, Unix
description: Metamorph is a real-time, concept-based search package. It will
search through anything without any pre-processing steps. Metamorph has an
English-language vocabulary of 250,000 word and phrase concept associations for
natural-language queries; Boolean logic (with weights) and wildcards can also
be used. It also provides proximity control, fuzzy searches, true regular
expression matching, and numerical value searches.
The Metamorph API alone is available for most operating systems.
|
MicroConcord |
|
|
program: MicroConcord
author: Mike Scott, Tim Johns
distributor: Mike Scott
documentation: none
download: freeware
operating system(s): DOS
description: MicroConcord is the predecessor of WordSmith. It is faster than Windows but the number of concordance
lines is limited to around 1,500, and you can't save a concordance except as a
text file.
|
MicroOCP - |
|
|
|
MonoConc Pro 2.0 |
|
|
program: MonoConc 2.0
author: Michael Barlow
distributor: Athelstan
documentation: unknown
download: demo limited to 20 hits
operating system(s): Win9x
description: MonoConc is a concordancer.
It can create concordances and word lists (with exclusion lists, case
sensitive or insensitive), convert texts, and work with tagged texts and with
different languages. Searching can be done with wildcard characters and
variable (multi-line) context (also a sentence). Results can be sorted by the
words to the left and right, and collocations of words can be computed.
|
Phrase Context 1.02 |
|
|
program: Phrase Context
author/distributor: Hans J. Klarskov Mortensen
download: test version
documentation: none
operating systems: Windows ?
description: Phrase Context is a versatile program
that counts words and phrases, produces concordances, calculates TTR
(type-token ratio) and lexical density values, accepts regular expressions as
search patterns, and writes XML-formatted output files. The program is still
in beta status.
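The two measures named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Phrase Context's own method: TTR is taken as distinct words divided by total words, and lexical density as the share of content words, using a tiny made-up stoplist (`FUNCTION_WORDS`). Some tools define lexical density differently, and all names below are hypothetical.

```python
import re

# Tiny illustrative stoplist of function words; an assumption, not a standard list.
FUNCTION_WORDS = {"the", "a", "an", "of", "and", "on", "in", "to", "is", "it"}

def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def type_token_ratio(text):
    """Distinct word forms (types) divided by running words (tokens)."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def lexical_density(text):
    """Share of tokens that are not in the function-word stoplist."""
    tokens = tokenize(text)
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens) if tokens else 0.0

if __name__ == "__main__":
    sample = "The cat sat on the mat"
    print(type_token_ratio(sample), lexical_density(sample))
```

TTR falls as texts get longer, so values are only comparable between samples of similar length.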
|
Sonar 6.0 (Windows) / 12 (MacOS) Text Retrieval/Document Management Systems |
|
|
program: Sonar 6.0
distributor: Virginia Systems
download: demo
documentation: none
operating systems: Win9x, WinNT, MacOS
description: High-speed program that can process many types of text and word
processing files.
|
TACT 2.1.5 - Text Analysis Computing Tools |
|
|
There is no information available on the web, but you can still download the program.
program: TACT 2.1.5
authors: Michael Stairs, John Bradley, Ian Lancashire, Lidio Presutti
download: free for research and teaching
operating system(s): DOS
description: TACT is a system of 15 programs designed for text retrieval
and analysis of literary works. Typically, researchers use TACT to retrieve
occurrences of a word, word pattern, or word combination. Output takes the form
of a concordance, a list, or a table. Programs also can do simple kinds of
analysis, such as sorted frequencies of letters, words or phrases, type-token
statistics, or ranking of collocates to a word by their strength of
association.
TACT is intended for individual literary texts, or small to mid-size groups of
such texts. Languages using a roman alphabet and classical Greek are supported.
There is also a mailing list for TACT-users.
|
Textalyzer |
|
|
program: Textalyzer
author: Bernhard Huber
distributor: none
documentation: self-explanatory
download: none
operating system: runs on a web site
description: Textalyser is a free text analysis tool that counts words,
sentences, and syllables and computes lexical density. It also computes the
Gunning readability index. A small but nice tool that counts syllables
correctly, at least for English, French, and German. You can paste in text or
specify a web page.
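Assuming the index referred to is the Gunning fog index, it is usually given as 0.4 × (average sentence length + percentage of words with three or more syllables). The Python sketch below illustrates that formula with a crude vowel-group syllable counter; it is not Textalyser's algorithm, the heuristic miscounts many English words, and the function names are hypothetical.

```python
import re

def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels (at least 1)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def gunning_fog(text):
    """0.4 * (words per sentence + 100 * share of words with 3+ syllables)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))

if __name__ == "__main__":
    print(round(gunning_fog("The cat sat. The dog ran."), 2))
```

Syllable counting is language-specific, so a real tool needs separate rules for each supported language.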
|
Textstat 2.1 |
|
|
program: Textstat 2.1
author: Matthias Hüning
distributor: Matthias Hüning
documentation: manual
download: freeware
operating system: Windows
description: TextSTAT is a simple programme for the
analysis of texts. It reads ASCII/ANSI texts and HTML files (directly from the
internet) and produces word frequency lists and concordances from these
files. The programme runs on MS Windows and is distributed as freeware. Source code in Python is also available for free. The user interface is available in German (default), English, and French.
|
WordSmith 4.0 |
|
|
program: WordSmith 4.0
author: Mike Scott
distributor: Mike Scott, Liverpool University
documentation: manual
download: beta version is still free
operating system: Win9x, WinNT
description: WordSmith is the successor of MicroConcord.
Many packages above offer free or limited trial versions.