Issues

Select view

Select search mode

 
29 of 29

Progressively degrading performance saving MT to database

Ready for Release

Description

MT performance seems to worsen the longer it continues, for larger documents.
The time it takes to process the translations in a (Hibernate?) request at org.zanata.service.impl.TranslationServiceImpl#translate(...) seems to increase from 500~1000 every use, suggesting there's a cache or leak in this area.

The debugging shows:

02:53:44,330 INFO [stdout] (pool-11-thread-1) Starting batch 0-1000 02:53:44,336 INFO [stdout] (pool-11-thread-1) Sending batch 0-1000 02:53:46,993 INFO [stdout] (pool-11-thread-1) Received response [1000] (2654) 02:53:46,993 INFO [stdout] (pool-11-thread-1) Saving in a batch 02:53:46,993 INFO [stdout] (pool-11-thread-1) Saving batch 0-100 02:53:46,993 INFO [stdout] (pool-11-thread-1) Transaction... 02:53:46,996 INFO [stdout] (pool-11-thread-1) Creating save requests... 02:53:49,518 INFO [stdout] (pool-11-thread-1) Done creating (2522) 02:53:49,519 INFO [stdout] (pool-11-thread-1) Committing translations... 02:53:49,599 INFO [stdout] (pool-11-thread-1) Processing translations... 02:53:56,063 INFO [stdout] (pool-11-thread-1) Done processing (6463) 02:53:56,141 INFO [stdout] (pool-11-thread-1) Done committing translations. 02:53:56,510 INFO [stdout] (pool-11-thread-1) Done transaction 100-100(9517) 02:53:56,510 INFO [stdout] (pool-11-thread-1) Saving batch 100-200 02:53:56,510 INFO [stdout] (pool-11-thread-1) Transaction... 02:53:56,511 INFO [stdout] (pool-11-thread-1) Creating save requests... 02:53:59,692 INFO [stdout] (pool-11-thread-1) Done creating (3180) 02:53:59,692 INFO [stdout] (pool-11-thread-1) Committing translations... 02:53:59,694 INFO [stdout] (pool-11-thread-1) Processing translations... 02:54:06,895 INFO [stdout] (pool-11-thread-1) Done processing (7201) 02:54:06,897 INFO [stdout] (pool-11-thread-1) Done committing translations. 02:54:07,005 INFO [stdout] (pool-11-thread-1) Done transaction 200-200(10495) 02:54:07,005 INFO [stdout] (pool-11-thread-1) Saving batch 200-300 02:54:07,005 INFO [stdout] (pool-11-thread-1) Transaction... 02:54:07,005 INFO [stdout] (pool-11-thread-1) Creating save requests... 02:54:10,214 INFO [stdout] (pool-11-thread-1) Done creating (3209) 02:54:10,214 INFO [stdout] (pool-11-thread-1) Committing translations... 02:54:10,216 INFO [stdout] (pool-11-thread-1) Processing translations... 02:54:18,078 INFO [stdout] (pool-11-thread-1) Done processing (7862) 02:54:18,079 INFO [stdout] (pool-11-thread-1) Done committing translations. 02:54:18,236 INFO [stdout] (pool-11-thread-1) Done transaction 300-300(11231) 02:54:18,236 INFO [stdout] (pool-11-thread-1) Saving batch 300-400 02:54:18,236 INFO [stdout] (pool-11-thread-1) Transaction... 02:54:18,237 INFO [stdout] (pool-11-thread-1) Creating save requests... 02:54:21,858 INFO [stdout] (pool-11-thread-1) Done creating (3621) 02:54:21,858 INFO [stdout] (pool-11-thread-1) Committing translations... 02:54:21,860 INFO [stdout] (pool-11-thread-1) Processing translations... 02:54:31,698 INFO [stdout] (pool-11-thread-1) Done processing (9838) 02:54:31,700 INFO [stdout] (pool-11-thread-1) Done committing translations. 02:54:31,934 INFO [stdout] (pool-11-thread-1) Done transaction 400-400(13698)

Explanation: (^^ means return to)

org.zanata.service.impl.MachineTranslationServiceImpl#addMachineTranslationsToDoc
Sending batch 0-1000 sending a batch of 1000 textflows to MT provider
Received response [ batch size] (time) time taken to receive the result

org.zanata.service.impl.MachineTranslationServiceImpl#saveTranslationsInBatches
Saving in a batch immediately before the batch loop
Saving batch x-y immediately before sublist
Transaction... immediately before try { transaction.run(() ->

org.zanata.service.impl.MachineTranslationServiceImpl#makeUpdateRequestsForBatch
Creating save requests... before process loop
Done creating (time) time taken to create the list of TransUnitUpdateRequest

org.zanata.service.impl.MachineTranslationServiceImpl#saveTranslationsInBatches ^^
Committing translations... immediately before translationService.translate(targetLocale.getLocaleId(), updateRequests);

org.zanata.service.impl.TranslationServiceImpl#translate(org.zanata.common.LocaleId, java.util.List<org.zanata.webtrans.shared.model.TransUnitUpdateRequest>, boolean)
Processing translations... before for (TransUnitUpdateRequest request : translationRequests) {
Done processing (time) time taken to process the requests in loop

org.zanata.service.impl.MachineTranslationServiceImpl#saveTranslationsInBatches ^^
Done committing translations. immediately after translationService.translate
Done transaction 100-100(time) loop iteration finished
Done batches all results processed

Environment

None

Details

Assignee

Reporter

Labels

Tested Version/s

Components

Sprint

Fix versions

Affects versions

Priority

More fields

Created 31 July 2018 at 03:16
Updated 10 August 2018 at 01:43
Resolved 8 August 2018 at 00:16

Activity

Show:

Former user31 July 2018 at 08:20
Edited

An "entityManager.clear();" at the end of MachineTranslationServiceImpl#saveTranslationsInBatches seems to work, cutting the increasing cycle from 10+ seconds to 1.5, but breaks when adding attribution at
MachineTranslationServiceImpl#prefillProjectVersionWithMachineTranslation entityManager.merge(doc);

It seems there is a refresh or reload required somewhere, but I don't know where.

Further to this, adding attribution as a new transaction after the first has completed seems risky. If the transaction fails, perhaps being a large transaction based on doc size, there may be a lot of unattributed translations.