
High Performance Blog

Updates on the world of high performance organizations and the information they use.


Posted in Database

The HDF5 format has become a de facto standard for handling scientific data.  While it has all the advantages of standardized methods, plus a recently added streaming capability, it also has some serious limitations.

Hierarchical Data Format has, to date, been built on the assumption of post-collection analysis.  Ancelus now introduces a Dynamic HDF5 capability that greatly expands that functionality.

1. Stream data in and query data at the same time.

2. In-stream statistical analysis at up to 1 million statistical ops per second.  Supports a two-layer strategy, with instant anomaly detection making the complex analytics more efficient.

3. Stream indexing completely eliminates downtime to re-index critical data tables.
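As an illustration of item 2, here is a minimal sketch of in-stream statistics with instant anomaly flagging, using Welford's online algorithm.  The class, threshold, and data below are hypothetical; this is not the Ancelus API.

```python
# Sketch: running mean/variance updated per arriving value, with an
# instant z-score anomaly flag.  Illustrative only, not Ancelus code.
import math

class StreamStats:
    def __init__(self, z_threshold=3.0):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0            # running sum of squared deviations
        self.z_threshold = z_threshold

    def push(self, x):
        """Flag x against stats seen so far, then fold it in."""
        if self.n >= 2:
            std = math.sqrt(self.m2 / (self.n - 1))
            anomalous = std > 0 and abs(x - self.mean) / std > self.z_threshold
        else:
            anomalous = False    # too little history to judge
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return anomalous

stats = StreamStats()
flags = [stats.push(v) for v in [10, 11, 9, 10, 11, 9, 10, 100]]
# only the final value (100) is flagged
```

The cheap flag runs on every value; the expensive analytics need only look where the flag fired.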

...

Posted in Database

There are many critical applications that require a dual commit capability with assured completion.  This involves processing a transaction on two computers at remote locations, with ACID-level assurance of completion at both sites.

Even though no standard has been finalized for this function, tentative measurements show Ancelus at 200 nanoseconds plus network latency.

This timing excludes the TCP/IP stack (or its replacement).  Since there are multiple network protocols with widely divergent performance, it is unlikely that we will see an end-to-end standard soon.
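For readers unfamiliar with the pattern, here is a toy sketch of the dual-commit idea: both sites must vote that they are prepared before either commits, and any failure aborts both.  The `Site` class and `dual_commit` function are illustrative only, not Ancelus internals.

```python
# Toy two-phase commit across two sites.  Illustrative sketch.
class Site:
    def __init__(self, name):
        self.name = name
        self.staged = None       # transaction durably staged, not yet visible
        self.committed = []

    def prepare(self, txn):
        self.staged = txn        # stage the work; vote yes
        return True

    def commit(self):
        self.committed.append(self.staged)
        self.staged = None

    def abort(self):
        self.staged = None

def dual_commit(txn, site_a, site_b):
    # Commit is issued only after BOTH sites vote "prepared".
    if site_a.prepare(txn) and site_b.prepare(txn):
        site_a.commit()
        site_b.commit()
        return True
    site_a.abort()
    site_b.abort()
    return False

a, b = Site("primary"), Site("remote")
ok = dual_commit({"id": 1, "op": "debit"}, a, b)
```

The guarantee is that neither site exposes the transaction unless both have it staged; real systems add durable logging and recovery around this skeleton.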

 


Posted in Database

Ancelus Transaction Processing tools have completed another round of testing.  Many substantial improvements came from beta-site feedback.  Ancelus now includes protections against all (we think) forms of deadlock.  It also includes messaging to help troubleshoot some of the common problems in application development.

Of particular importance is performance: the loss was only 1% with real-time ACID compliance on the persistent store.

It's now back for a second round of beta evaluation.  Integration into the TQL user interface is the next priority.


Posted in Database

Testing of the transaction processing (TXP) functions is nearing completion.  This set of utilities supports multi-lock for relational writes, and retains ACID compliance through real-time journal management.  With these additions Ancelus may be the easiest system to use for TXP application development.

The most important aspect of TXP is that there is no performance penalty because of the unique locking processes used by Ancelus.  Relational views can now operate at speeds previously restricted to streaming applications. 

A new part of the specification for Ancelus is the ability to start or stop the journal file in-stream.  It wasn't originally envisioned to operate this way, but in very large datasets it makes a major contribution to simpler administration and less downtime.
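The start/stop-in-stream idea can be pictured with a toy journal: writes keep flowing while journaling is toggled.  The classes below are a hypothetical sketch, not Ancelus's journal manager.

```python
# Sketch: a journal that can be started or stopped without pausing writes.
class Journal:
    def __init__(self):
        self.active = False
        self.entries = []

    def start(self):
        self.active = True

    def stop(self):
        self.active = False

    def record(self, op):
        if self.active:
            self.entries.append(op)

class Store:
    def __init__(self, journal):
        self.data = {}
        self.journal = journal

    def write(self, key, value):
        self.journal.record(("set", key, value))  # journal first (WAL order)
        self.data[key] = value

j = Journal()
db = Store(j)
db.write("a", 1)        # journal off: applied, not journaled
j.start()               # toggled on in-stream
db.write("b", 2)
j.stop()                # toggled off in-stream
db.write("c", 3)
```

The administrative win is that the toggle is just a flag flip; no write ever waits for the journal to open or close.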

The final step in the TXP delivery will be to integrate it into TQL.

 


Posted in Database

Yesterday we reviewed the completed Phase 1 release of the Java integration modules.  Ancelus now supports both JNI and JDBC integration.  

Both of these integrations are tightly coupled with the TQL (threaded query language) development tool unique to Ancelus.  TQL was specifically developed to retain the extreme performance of the Ancelus database.  It also has the advantage of reducing query development time from days to minutes.  The point-and-click screens deliver a fully optimized query and an instant results sample.  TQL prevents the most common sources of error and ambiguity inherent in SQL and other relational-centric query tools.  For example, it eliminates any need for developer knowledge of database internals or table structure.

We're ready to go to beta testing with the Java tools, but haven't yet settled on a site.  Most Ancelus customers develop applications in PHP or C, including our own RT3 and A3 applications.

If you would like to participate in this testing program, give us a call. 

Posted in Database

Beta testing of the TQL editor is now in progress at a live site.  Stay tuned.


Posted in Database

A cautionary note.  AncelusDB is not supported on CentOS 6.7.  This OS has a bug that corrupts the memory map and is guaranteed to cause the application to fail unpredictably.  

Users of 6.7 should immediately upgrade to CentOS 6.8 or higher.


Posted in Database

The barriers have finally been broken down.  The primary architecture is defined and the prototype is functional.  We had been distracted by an attempt to keep the query language completely independent of the physical data store, which is the foundational assumption of SQL.  Once we let go of that, it didn't take long.

Since more than 90% of queries do not require that degree of independence, a new data store concept can get us out of the trap of being held hostage to the needs of the other 10%.  And with no performance penalties.  The 10% of queries that are truly ad hoc (not defined by the application) will be slower than the 90% cases, but still faster than all systems built on the relational assumption.

The packaging and clean up to get this to beta level will take several months, but it now looks like we're on the right track.  


Posted in Database

Our development work on TQL continues, but not without some setbacks.  Several strategies have been abandoned in the past 18 months, and the idea of keeping it an arm's-length toolset (independent of the database) is now on the shelf.

TQL development will now move down a new path, but with less disclosure.  We think we're on a track that will result in a new operating model.

Watch this space.  We'll let you know as soon as we can.


Posted in Database

A common objection heard from IT organizations when discussing Ancelus is "Our database and technology stack is fast enough."  

So how fast is "fast enough"?  This usually means the user response time doesn't generate too many complaints.  But that's a red herring.

When it comes to database performance, Speed = $$$.  Accepting a relational system as "fast enough" usually means "lots of hardware solved it."  We had one customer who reduced its server count from 252 to 26.

The more precise question would seem to be "How much technology bloat would you like to fund?"

 


Posted in Database

The new benchmarks are now published on the benchmarks page.

Described as "amazing" and "impressive" by industry analysts.


Posted in Database

What would happen if we tested the world's fastest database on the world's fastest server?

We usually ignore the announcements of the hardware and chip manufacturers.  A debate about a 3 GHz clock vs. a 3.1 GHz clock has no significant effect on Ancelus performance, because our unique architecture bypasses most of the operating system and CPU overhead.

That's why we chose to publish our benchmarks on a 1U pizza-box, mail-order server that cost under $8,000.

But something happened last week that is changing our attitude.  We received a new server based on the Intel Broadwell CPU. This is the unquestioned leader in server performance.  We're now in the process of running new benchmarks. We're suddenly paying a lot of attention to this hardware.  

First results suggest that Intel has done something very fundamental: a slower clock speed (2.2 GHz) but much higher performance.

...

Posted in Database

The NoSQL movement has received a lot of attention over the past few years.  But the reality is turning out to be different from the hype.

All of these systems start with an assumption of a relational model.  Most are table based, a few are columnar/relational.  They all fall into the same trap.  They assume the issue is with SQL and that they need to find a faster way to do SQL.  They haven't yet gotten to the real problem.

The relational assumption is the core problem.  It doesn't match the native state of information in nature, so it requires translation routines.  The problem isn't that SQL is badly designed; SQL is forced to do massive transforms because of the relational model.  Trying to make SQL faster misses the point that SQL does massive amounts of unproductive work.  In normalized relational structures (many tables) the amount of unproductive work expands exponentially.  In de-normalized relational structures (one or a few tables) the schema structure is an add-on.  In either case, scaled performance decays exponentially.

Ancelus eliminates all these problems by eliminating the relational storage structure and replacing it with a mathematical model.  The physical storage model is purely abstract.  The logical structure is purely native information - linked and recursive lists. Columns and tables are abstractions, mathematically derived.

The result is that instead of handling data many times, discarding the uninteresting, Ancelus only handles the interesting stuff - the final results set.  Dramatic reduction in computing and network load.
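One way to picture "columns and tables as mathematically derived abstractions" is a structure where entities and links are stored natively and a relational-style view is computed only when asked for.  This sketch is purely illustrative and says nothing about Ancelus's actual storage model; the entities and field names are made up.

```python
# Data held as linked structures; a "table" is derived on demand,
# never stored physically.  Illustrative sketch only.
orders = {
    "o1": {"customer": "c1", "amount": 40},
    "o2": {"customer": "c1", "amount": 10},
    "o3": {"customer": "c2", "amount": 25},
}
customers = {"c1": "Acme", "c2": "Bolt"}

def derive_table(orders, customers):
    """Materialize a relational-style view from the linked structure."""
    return [
        {"order": oid, "customer": customers[o["customer"]], "amount": o["amount"]}
        for oid, o in orders.items()
    ]

view = derive_table(orders, customers)   # the "table" exists only here
```

Because the view is computed, only the final results set is ever handled; nothing uninteresting is scanned and discarded.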

...

Posted in Database

Ancelus 6 is released.  Includes performance improvements.  New benchmarks will publish soon.


Posted in Database

The latest update to the Ancelus database, version 6.0, has been installed in beta mode at a customer site.  It includes major improvements in varchar speed and memory utilization.

GA release target date is June 1. 

 


Posted in Database

There seem to be two trends in dealing with the explosion of data from the Internet of Things.  The dominant theme focuses on efficiently administering ever-bigger datasets.  This approach has no end game that we can see, since the most successful examples are de-normalized data structures struggling with write-speed constraints.  That means duplicating massive amounts of data.  So you solve the excess-data problem by increasing the amount of data?

Hadoop and Cassandra use two different approaches, but they both involve a vast array of hardware.  And neither has a reputation for addressing the real issue of cutting "time-to-insight."

The second is the still-small use of streaming analytics: do it on the fly and focus on the efficiency of the data scientist rather than the admin.  Most of these solutions deliver limited scope, but we've taken a different approach.  We use the extreme performance of Ancelus to enable more precise flagging of interesting trends.  First, statistics operate on the live data stream; then executable stored procedures accelerate the data scientist's in-depth analysis, but only after the statistics have pointed to an area of interest.  No fishing expeditions needed.

Our streaming toolset is called A3 (Ancelus Adaptive Analytics), and it serves both purposes without the hardware explosion implied by physically de-normalized data.  The Ancelus logical structure is 100% normalized and 100% de-normalized at the same time.  No data duplication and no pre-defined storage structure, so the logical structure is unconstrained.  It can even be changed on the fly without downtime.
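The two-layer strategy above (a cheap in-stream flag first, expensive analysis only where flagged) can be sketched as follows; the windowing, threshold, and statistics are hypothetical stand-ins, not A3's real pipeline.

```python
# Cheap screen on every window; deep analysis runs only on flagged windows.
from statistics import mean, stdev

def cheap_flag(window, limit=50):
    return max(window) > limit          # O(n) check, run on everything

def deep_analysis(window):
    return {"mean": mean(window), "stdev": stdev(window)}  # run rarely

stream = [[10, 12, 11], [9, 10, 13], [11, 60, 12]]
results = [deep_analysis(w) for w in stream if cheap_flag(w)]
# only the window containing 60 reaches the expensive stage
```

The point of the pattern is that the data scientist's expensive tools see only the interesting windows, not the whole stream.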

New technology, new game.


Posted in Database

The white paper at the link explains the root cause of the massive overload emerging from the Internet of Things, how it will affect current practice, and what to do about it.

 

The-Emerging-Crisis-in-Big-Data.pdf


Posted in Database

The Ancelus database has always been available to run on hosted servers, Amazon Web Services being the most common.  But we've launched a project this week to modify the architecture to support a new pricing model.

Our first step is to implement our demo system on AWS to eliminate the need for a local install.  Local systems often ran into version conflicts between PHP and Apache that took time to sort out.  The new system will be pre-installed on Red Hat and immediately operable: sign up and start using it.  By using AWS we can implement large memory, disk, and CPU configurations for temporary use without having to own the hardware.

The second step will be to add tracking functions to support transaction pricing and other usage-based pricing strategies.  This seems to be the other major complaint against site-installed systems.

This will also let us offer specific benchmark data comparing various cloud offerings. We continue to hear of major performance differences, but with no current way to validate how it affects Ancelus.

Stay tuned.  This could be fun.


Posted in Database

One of the common questions we get in our monthly webinars relates to the behavior of Hadoop in big data analytics.  The root cause of the problem that Hadoop purports to solve is found in the nature of disk drives and how they interact with all structured data storage systems.

First, some perspective on the origins of the Big Data discussion.  The following chart shows the response characteristic of large datasets over the past 30 years.  Over the life of relational databases we have seen exponential growth in the storage density of disk drives, followed by a proportional increase in the size of large databases.  Unfortunately, the response time of these systems has degraded along the same exponential curve as storage density has grown.  The reason is that there has been almost no improvement in retrieval time from a disk: storage density (and dataset size) increases according to Moore's law, while retrieval time is bounded by Newton's laws of motion (a mechanical head can only move so fast).  The crossover about a decade ago marked an irreversible tipping point for all disk-based systems.

[Chart: Big-Data.png, response characteristic of large datasets over 30 years]
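The crossover can be modeled with back-of-the-envelope arithmetic: if dataset size doubles every two years (Moore's-law growth) while per-seek latency stays flat, the time to sweep a dataset grows along the same exponential.  The numbers below are illustrative assumptions, not measurements.

```python
# Toy model: flat seek latency vs. exponentially growing dataset size.
SEEK_MS = 8.0                    # per-seek latency, roughly flat for decades
records_0 = 1_000_000            # dataset size at year 0 (assumed)

def sweep_hours(years, seeks_per_record=1):
    """Hours to touch every record once, one seek per record."""
    records = records_0 * 2 ** (years / 2)   # doubling every 2 years
    return records * seeks_per_record * SEEK_MS / 1000 / 3600

t0, t30 = sweep_hours(0), sweep_hours(30)
growth = t30 / t0                # 2**15: the sweep is 32768x slower
```

Whatever the exact constants, the shape is the point: flat seek time against exponential dataset growth yields exponentially degrading response.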

To solve this problem the analytics industry has moved away from relational databases in favor of de-normalized storage structures like Hadoop and SAS.  This approach eliminates the time-consuming joins of the relational world, but does so at the expense of storage efficiency.  It explodes the size of the stored image through duplicate data and sparse-matrix issues (the reason for relational databases in the first place).  The solution is at best temporary; in most cases it's a reversion to concepts of the late 1950s.

The entire design goal of the Ancelus database was to eliminate the need for these Hobson's choices.  Extreme speed, extreme size, extreme complexity, extreme scaling, non-stop operation, live time-series pipelines, and much more.  All in the same record-setting, patented system.

Posted in Database

This question comes up often, so we probably need a better explanation.

The traditional list of database types generally includes the following: 

  • Hierarchical
  • Network
  • Relational
  • Object
  • Semi-structured
  • Associative
  • Entity-attribute-value
  • Context Model
  • Graph


A concise description of each can be found HERE. Each defines a different way of organizing data elements and relationships in a way that allows it to fit the two-dimensional structure of computer storage. In essence it is a model of the physical structure of the storage.

There is a new class of database that does away with the mapping and transforms:

  • Algorithmic Database

This class uses the native logical model of information directly, without transform or mapping.  The physical store is decoupled, abstract, and unpredictable.
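To make "decoupled, abstract, and unpredictable" concrete, here is a toy store where logical order lives entirely in links while physical placement is arbitrary.  This sketches the general idea only, not the patented Ancelus layout.

```python
# Logical structure = links; physical structure = arbitrary slot numbers.
import random

class Store:
    def __init__(self, seed=0):
        self.slots = {}            # physical: slot -> value
        self.next_link = {}        # logical: slot -> next slot
        self.head = None
        self.tail = None
        self.rng = random.Random(seed)

    def append(self, value):
        slot = self.rng.randrange(1 << 30)   # unpredictable physical address
        while slot in self.slots:
            slot = self.rng.randrange(1 << 30)
        self.slots[slot] = value
        self.next_link[slot] = None
        if self.tail is None:
            self.head = slot
        else:
            self.next_link[self.tail] = slot
        self.tail = slot

    def logical_order(self):
        out, slot = [], self.head
        while slot is not None:
            out.append(self.slots[slot])
            slot = self.next_link[slot]
        return out

s = Store()
for v in ["a", "b", "c"]:
    s.append(v)
order = s.logical_order()   # insertion order, regardless of slot numbers
```

Readers follow links, never physical addresses, so the physical layout can be anything at all without affecting the logical model.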

...