The size of the data stored in Germain Datastore depends on two factors:
Technology being monitored
The amount of data generated by the monitored technology varies with factors such as the number of users, the volume of transactions, the complexity of the application, and the frequency of data updates, so different technologies can produce very different data volumes.
Database technology used
The database technology used to store the data also affects storage size. Database technologies differ in storage efficiency, compression algorithms, and data organization methods, all of which influence the overall size of the stored data.
To determine the size requirements for your scenario, perform a sizing exercise: analyze the expected data volume, data retention policies, and other factors specific to your environment. Germain can provide guidelines and best practices for data sizing and help you estimate storage requirements based on your specific monitoring and database setup.
Working closely with Germain helps ensure that the storage capacity of the Germain Datastore matches your data storage needs, with efficient resource utilization and optimal performance.
Germain DB Sizing Example
Consider an example scenario: 1500 concurrent users work in a CRM and an ERP application for 8 hours per day, with a data retention policy of 60 days for raw data and 1 year for aggregated data. The database storage requirement depends on several factors, including the data volume generated by the applications and the chosen database technology.
To estimate the storage requirements, you would need to consider the following:
Data Volume
Determine the average amount of data generated per user per hour for both the CRM and ERP applications, accounting for the number of transactions, the size of each transaction, and any additional data being logged or captured.
Raw Data Retention
Multiply the average data volume per user per hour by the number of concurrent users and the number of hours per day. Then, multiply that result by the number of days you want to retain the raw data (in this case, 60 days). This will give you an estimate of the total raw data storage required.
Aggregated Data Retention
Aggregated data typically takes up less space than raw data. Determine the average size of the aggregated data per user per hour and then calculate the storage required for the number of concurrent users and the number of hours per day. Multiply this result by the number of days you want to retain the aggregated data (in this case, 1 year).
Overhead and Indexing
Consider any additional overhead and indexing requirements specific to your chosen database technology. These factors can affect the overall storage requirements.
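The steps above can be sketched as a simple calculation. Every figure below (the per-user data rates and the 1.3 overhead multiplier) is a hypothetical placeholder, not a Germain measurement; substitute values measured in your own environment.

```python
# Hypothetical inputs -- replace with measured values from your environment.
concurrent_users = 1500        # CRM + ERP users online at once
hours_per_day = 8              # active hours per user per day
raw_kb_per_user_hour = 500     # assumed avg raw data per user per hour (KB)
agg_kb_per_user_hour = 50      # assumed avg aggregated data per user per hour (KB)
raw_retention_days = 60        # raw data retention
agg_retention_days = 365       # aggregated data retention (1 year)
overhead_factor = 1.3          # assumed multiplier for indexing and DB overhead

def storage_gb(kb_per_user_hour, retention_days):
    """Total storage in GB for one data class over its retention window."""
    kb = kb_per_user_hour * concurrent_users * hours_per_day * retention_days
    return kb / 1024 / 1024    # KB -> GB

raw_gb = storage_gb(raw_kb_per_user_hour, raw_retention_days) * overhead_factor
agg_gb = storage_gb(agg_kb_per_user_hour, agg_retention_days) * overhead_factor
print(f"Raw: {raw_gb:.0f} GB, Aggregated: {agg_gb:.0f} GB, "
      f"Total: {raw_gb + agg_gb:.0f} GB")
```

Changing any single input (for example, doubling the raw retention to 120 days) scales the corresponding term linearly, which makes it easy to explore retention trade-offs before committing to a storage plan.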
Note that these estimates are based on this example and will vary depending on the specific characteristics of your CRM and ERP applications, the types of data being generated, and the database technology in use. Consulting with Germain or conducting a sizing exercise with their support will yield more accurate estimates for your environment and requirements.
Database Storage Estimate
Excluding Session Replay
The estimate combines the following inputs: assumed unit size, raw/RCA data retention period, aggregated data retention period, 1-hour Session Replay size, the number of concurrent users using your app, and the hours/day a user uses your app, yielding the Total Estimated Storage.
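If Session Replay is included, its contribution can be estimated the same way as the raw and aggregated data. The 10 MB-per-user-hour figure below is purely an assumed placeholder, not a Germain measurement; measure the actual replay size for your application, since this term tends to dominate the total.

```python
# All figures are hypothetical placeholders -- replace with measured values.
replay_gb_per_user_hour = 0.01   # assumed ~10 MB of Session Replay per user-hour
concurrent_users = 1500
hours_per_day = 8
replay_retention_days = 60       # assume replay follows the raw-data retention

replay_gb = (replay_gb_per_user_hour * concurrent_users
             * hours_per_day * replay_retention_days)
print(f"Estimated Session Replay storage: {replay_gb:,.0f} GB")
```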
Note: When using AWS as the cloud provider, there is an option called "provisioned IOPS" where AWS guarantees the performance of the storage with a specific number of input/output operations per second (IOPS). This option can be useful when you require high-performance storage for your application.
However, if you want to optimize costs and the performance requirements of your application can be met with lower IOPS, you can choose the "GP2 (SSDs)" option. GP2 stands for General Purpose SSD, and it provides a balance between performance and cost. GP2 volumes offer burst performance for a general-purpose workload and are suitable for many applications.
By selecting GP2 volumes instead of provisioned IOPS, you can save costs while still maintaining good performance. It's important to note that the performance of GP2 volumes is influenced by the size of the volume. Larger volumes have higher baseline performance and burst performance limits.
In summary, if your application's performance requirements can be met by GP2 volumes, they are a cost-effective alternative to provisioned IOPS. Base the decision on the specific needs of your application and workload.
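As a rough illustration of how GP2 baseline performance scales with volume size, AWS documents the gp2 model as 3 IOPS per GiB, floored at 100 IOPS and capped at 16,000 IOPS (verify against current AWS documentation, as these figures can change):

```python
def gp2_baseline_iops(volume_gib):
    """Approximate gp2 baseline IOPS: 3 IOPS per GiB,
    with a floor of 100 and a cap of 16,000 (per AWS gp2 docs)."""
    return min(max(3 * volume_gib, 100), 16_000)

# A larger volume buys a higher performance baseline:
for size in (50, 500, 8000):
    print(f"{size} GiB -> {gp2_baseline_iops(size)} baseline IOPS")
```

This is why sizing the volume slightly above the raw storage requirement can be a cheaper way to gain throughput than switching to provisioned IOPS.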
Example: 2 Applications
This example compares Session Replay network traffic at two seek intervals (2 min and 5 min) across three activity levels: Extreme User Activity, Expected User Activity, and Idle User Activity.
Note: The network traffic generated by a monitored application can vary significantly between different versions or releases. It is essential to conduct a proper sizing analysis to understand and accommodate the network traffic requirements for each version of the application.
During a sizing analysis, you would evaluate factors such as the volume of network requests, the size of data transferred, the frequency of interactions, and any specific patterns or spikes in traffic. By assessing these aspects, you can determine the appropriate network capacity and resources needed to effectively monitor the application.
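As a sketch of that evaluation, steady-state and peak bandwidth can be approximated from a measured traffic profile. All numbers below are hypothetical placeholders; replace them with figures captured during the sizing analysis for the specific application version being monitored.

```python
# Hypothetical traffic profile -- replace with measured figures.
events_per_user_per_min = 12   # monitoring events emitted per user per minute
avg_payload_bytes = 2_000      # average size of one monitoring event
concurrent_users = 1500
peak_factor = 3.0              # assumed headroom for spikes in traffic

bytes_per_sec = events_per_user_per_min / 60 * avg_payload_bytes * concurrent_users
mbps = bytes_per_sec * 8 / 1_000_000        # steady-state megabits per second
peak_mbps = mbps * peak_factor
print(f"Steady state: {mbps:.1f} Mbit/s; provision for peaks: {peak_mbps:.1f} Mbit/s")
```

Repeating this calculation per application version makes version-to-version traffic differences visible before they become bandwidth bottlenecks.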
Sizing analysis helps ensure that the monitoring system can handle the network traffic generated by the application without experiencing performance issues or bottlenecks. It allows you to allocate sufficient resources, such as network bandwidth, processing power, and storage capacity, to accommodate the expected traffic patterns.
By performing a sizing analysis for each significant release or version of the monitored application, you can adapt the monitoring infrastructure accordingly and ensure accurate and reliable monitoring of network traffic. This proactive approach enables you to capture and analyze the necessary data for effective monitoring and gain valuable insights into the application's performance and behavior.