Request For Table Indexes?

For requests or help with our API
Post Reply
mandmdiet
Posts: 116
Joined: Tue Aug 17, 2021 9:37 am

Request For Table Indexes?

Post by mandmdiet » Tue Feb 06, 2024 10:13 am

Is it possible for us to make a request that OD add indexes to tables to help improve our ability to query data quickly?

We're using the Remote API to to poll some tables for changes. We intend to do that using the TIMESTAMP fields we found in these tables, but some (or most?) of the tables don't have an index on these fields. An example is for claimtracking, the field name is DateTimeEntry but there is no index on that field and so a query, on a database with a lot of data, can take upwards of 5 seconds to return.

If we can't find another way to optimize this on our side, does OD entertain requests to add indexes like this to a table? I know there are also potential downsides to adding indexes and that there is such a thing as too many indexes. They can bloat the table size and even slow other queries down, but this looks like an index that is genuinely needed. I'm guessing the Events API would want to take advantage of an index like this on all of the tables it supports too.

Thanks!

allends
Posts: 235
Joined: Fri Aug 23, 2013 11:29 am

Re: Request For Table Indexes?

Post by allends » Tue Feb 06, 2024 12:45 pm

Typically we do not entertain index requests, mostly due to the side effects you mentioned.
When we do tend to add indexes, it is because a system in Open Dental is demonstrably slow due to a lack of said index.

A 5+ second query time to poll a table for changes shouldn't be too terrible for most use cases, unless you are attempting to poll this table for real-time changes.
Allen
Open Dental Software
http://www.opendental.com

mandmdiet
Posts: 116
Joined: Tue Aug 17, 2021 9:37 am

Re: Request For Table Indexes?

Post by mandmdiet » Tue Feb 06, 2024 1:56 pm

Thanks for the reply allends

There are several tables in OD that we intend to poll on an interval. To save on number of requests to the OD Remote API, we want to poll those tables in a single UNION if possible. We don't know what kind of volume of data we'd expect to find in the wild, but we are worried that if a single table is taking 5 seconds and others are also taking a few seconds each then we might hit the timeout limits of the API.

But, the other thing we're worried about is the churn we'd cause in the database without an index if the table is large enough. Causing MySQL to do a table scan on a large table can affect disk usage, memory and CPU and we're concerned about the client's experience using Open Dental while we're running this query.

allends
Posts: 235
Joined: Fri Aug 23, 2013 11:29 am

Re: Request For Table Indexes?

Post by allends » Tue Feb 06, 2024 2:39 pm

I share your concerns about churn on the database, but the scope of this request increased by an indeterminate amount, since I am unsure how how many indexes we would need to add to an unknown amount of tables.

I am assuming, based on your concern, that your frequency for this polling interval is very rapid. Something under 1 minute would be my guess.
The API isn't really designed for polling of that kind outside of using our built-in events.
What is your intended use case for such a rapid interval of polling?
Allen
Open Dental Software
http://www.opendental.com

mandmdiet
Posts: 116
Joined: Tue Aug 17, 2021 9:37 am

Re: Request For Table Indexes?

Post by mandmdiet » Wed Feb 07, 2024 6:56 am

I had a good conversation with justine about this over the last couple of weeks. The details of what we're doing is in there.

https://www.opendentalsoft.com:8085/for ... f=9&t=8193

Here is a relevant tidbit to your question
We have users who, when using our software, will see a list of claims that can be submitted to insurance for collection. If our software is made aware of a change to one of these claims in Open Dental we update our claim data so that the user knows that claim has changed. Then, when the claim is submitted from our software, the claim has the most up-to-date information on it. If a claim is deleted, the user would likewise want to see the claim removed from our system so as not to submit a claim for collection that is no longer valid.

Given this scenario, we'd probably want to do something like this:

1. On a regular cadence, poll the tables using either the OD local API or [Remote] API to check for changes to claims. The query would be something like SELECT ClaimNum FROM Claim WHERE SecDateTEdit > LastPollTimestamp. This would give us a list of records that have changed in the Claim table since the last time we polled. So, if we polled every 2 minutes (for example) we'd only get results for 2 minutes worth of changes to that table. Querying every 2 minutes would be 720 requests per day, every minute would be 1,440 and every 30 seconds would be 2,880.
2. On a regular cadence, poll tables for deleted records. Something like SELECT ClaimNum FROM Claim WHERE ClaimNum IN (12345,12346, ...). This would give us a list of claims that haven't been deleted and by process of elimination we can figure out those that were deleted.

We have more than just the claim table to poll, but we could poll more than one table in a single request if there is a concern about the number of requests going through the API. This would work similar to how your Events API works where you can configure it to poll every N seconds.

So doing this kind of thing would put us well under the 86,400 limit, which, ostensibly, seems like it would be fine, right? It seems like the minimum we could do to support this feature. One last tidbit to add is that we plan on having up to hundreds of Open Dental instances on our software sometime this year.

allends
Posts: 235
Joined: Fri Aug 23, 2013 11:29 am

Re: Request For Table Indexes?

Post by allends » Thu Feb 08, 2024 9:02 am

I still don't have a full grasp of the scope of this request. There is an unknown number of tables that you are requesting to poll like this and it is unlikely that any of them have an index on a datetime column which Open Dental rarely, if ever, queries on.
It would be easier to assist you if I knew what the end goal of your project was going to be.
In the absence of that I can give only give basic guidance.

Concerns:
...a query, on a database with a lot of data, can take upwards of 5 seconds to return
A query taking five seconds to do a table scan and return data is unfortunate, but it is only really impactful if your interval for running that query is sufficiently fast. Running said query once an hour is much less impactful than trying to run it once every ten seconds obviously.
Designing your system so the interval you use can be longer is one way to address this problem.
This would give us a list of claims that haven't been deleted and by process of elimination we can figure out those that were deleted.
I surmise that part of your goal is to add/remove claims from your UI as they are added or removed in Open Dental.
Keeping two separate systems in sync in close to real-time is going to be costly, so I would propose you validate the state of objects when interacting with them instead.
Example: Claim gets sent in your UI > Attempt to update said Claim through the API > API returns that the ClaimNum doesn't exist > Remove Claim from your UI and notify customer.
This would give us a list of records that have changed in the Claim table since the last time we polled.
As I said above, these concerns are difficult to address without knowing your goals.
Let says, as an example, you attempting to mimic a system similar to the Batch Claim Send system in Open Dental, you could design your UI to reduce potential concurrency problems by limiting the claims available to only the claims for a single clinic. Then you would be getting all the unsent claims for a single clinic (and their additional info) from the API whenever a user interacted with the claims in any way or if they reloaded the window in some fashion. That way you are using indexed columns for your queries and your return size is generally going to be smaller.

Hopefully some of this is pertinent to your plans and can help you moving forward.
Allen
Open Dental Software
http://www.opendental.com

mandmdiet
Posts: 116
Joined: Tue Aug 17, 2021 9:37 am

Re: Request For Table Indexes?

Post by mandmdiet » Thu Feb 08, 2024 1:52 pm

Thanks Allends, yes I understand that not having visibility into our requirements makes this difficult. I apologize for that.

One thing I want to mention is I was just beginning this thread to understand whether Open Dental accepts requests to add indexes to tables if we discover we need them, so this conversation may be getting away from me a bit. At this point we're not really sure if this is going to be needed. All of the things mentioned here are accurate and are things we're working through but let's not make this a formal request for any index additions as we may determine that the polling interval and the tables we're looking at don't end up being a big issue index-wise, we just aren't sure at this point.

We have one client using Open Dental Cloud right now and we'll do some additional research with this client before we would broach a conversation formally asking for an index.


I won't go into too many details but I will try to give you that missing context you need. One of our requirements is that with regards to some data that is found in Open Dental, we keep our system in sync with the Open Dental system and the big reason for that is we're required to ensure that even in the event of an infrastructure failure at the practice, e.g. the office internet or server goes down or they have a power outage, etc., our users to be able to continue to work, submitting claims and other things. So, for example, if a claim changes in Open Dental, we would want to know as quickly as possible that the claim has changed and update our system. If the user attempts to send the claim and the office network is down, it won't matter for the users of our system. The interval we're thinking about is every 30 seconds at this point. Given that every 30 seconds the number of changed claims, for example, is likely to be 0 most of the time, we're not talking about a lot of data here, the real issue with regards to the data is just the data the query is moving through in the database from that table scan.

As a side note, we actually have a different implementation of this functionality using MySQL / MariaDB replication, but we can't employ that solution for databases hosted on the Open Dental Cloud. If it would be possible to have replication turned on and for our client to connect to the replication logs for that database then we would be golden and instead of polling we'd just be notified of changes when they happen. So, if there is any option for us to make that work we'd love to explore that with you.

However, we have been operating on the understanding that this won't be feasible and so are looking at something akin to the Events API that Open Dental provides. There are a few reasons the Events API isn't ideal for us, the most basic one being that the Events API only supports 3 tables at the moment. We could request that more tables be added to that, certainly, but us being in control of the format of the data coming from this feature as well as being able to collate updates across tables by a timestamp so that we know the order in which events occurred is very important for us as well, and harder with the Events API.

One thing we thought made sense, from a hypothetical scenario, is that if we were using the Events API and that we requested the Claim table be added to the Events API, wouldn't, to support that polling feature, it be necessary to add an index to the TIMESTAMP field in that table anyway? From what I understand, the Events API can be configured to poll every X seconds and that it would have to use that TIMESTAMP field to do the polling. So, we thought this would possibly be an easy ask since it seemed like that was an intended future use of some of these TIMESTAMP fields anyway if they began being supported in the Events API.

I know you're still concerned about the number of tables we're talking about so let me see if I can provide a list and if you would be amenable to adding indexes for the tables in the list. These are tables we're interested in knowing when they've been changed and plan to use the TIMESTAMP field (or in a couple cases a DateTime field) to determine that.

periomeasure, SecDateTEdit
procedurelog, DateTStamp
paysplit, SecDateTEdit
commlog, DateTStamp
procnote, EntryDateTime
fee, SecDateTEdit
adjustment, SecDateTEdit
claimproc, SecDateTEdit
payment, SecDateTEdit
appointment, DateTStamp
toothinitial, SecDateTEdit
claimtracking, DateTimeEntry
claim, SecDateTEdit
patient, DateTStamp
patientnote, SecDateTEdit
claimpayment, SecDateTEdit
perioexam, DateTMeasureEdit

While this is most of the tables we're interested in I have omitted several because I believe the data in those tables is likely to have a low record count in all practices and the query planner would probably do a table scan anyway. This list is ordered from the tables we expect to have the most records at the top and those with the lowest expected count at the bottom. However, in all frankness, we don't know what kinds of record counts or time delays we might see generally across the hundreds of Open Dental clients we're working with but we do anticipate these tables having enough records in most cases (or enough cases) to be a problem for polling without the index.

Thanks again Allends
Last edited by mandmdiet on Thu Feb 08, 2024 3:34 pm, edited 2 times in total.

Post Reply