use temp directory instead of /opt for storing temporary downloaded schema files #17

Closed
cchou opened this issue Feb 6, 2012 · 15 comments

@cchou
Member

cchou commented Feb 6, 2012

When xmlresolution parses documents and downloads the external links they reference, it stores the downloaded files in its installation directory. Those files should instead go to the system temp directory defined by the TMP_DIR environment variable. That would be consistent with other DAITSS software, and it is better practice because the downloaded files are only temporary.

For background information, see https://prb.fcla.edu/rt3/Ticket/Display.html?id=15929

@iterman
Contributor

iterman commented Jul 26, 2012

XMLRESOLUTION code was modified to allocate a system temporary directory for storage of its schemas and collections (a collection, in the context of xmlresolution, is a set of XML files associated with a package). The temporary directory and its contents, if there are any, will be deleted when xmlresolution is shut down. A schema is deleted from the schemas directory when no collection references the schema. A collection is deleted when core has requested the tar file from xmlresolution via the GET. The amount of storage xmlresolution consumes is greatly reduced.
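
To illustrate the pruning rule described in this comment (a schema goes away once no collection references it), here is a minimal sketch. It is not the xmlresolution code: the `SchemaStore` class, its method names, and the in-memory bookkeeping are all assumptions made for the example, and the real service keeps this state on disk.

```ruby
require "set"

# Hypothetical in-memory model: each collection id maps to the set of
# schema ids it references. Names and structure are illustrative only.
class SchemaStore
  attr_reader :schemas

  def initialize
    @collections = {}    # collection id => Set of schema ids
    @schemas = Set.new   # schema ids currently stored
  end

  def add_reference(collection_id, schema_id)
    (@collections[collection_id] ||= Set.new) << schema_id
    @schemas << schema_id
  end

  # Drop a collection (for example after its tar file was served), then
  # drop every schema that no remaining collection references.
  def remove_collection(collection_id)
    @collections.delete(collection_id)
    still_referenced = @collections.values.reduce(Set.new) { |acc, ids| acc | ids }
    @schemas.keep_if { |id| still_referenced.include?(id) }
  end
end

store = SchemaStore.new
store.add_reference("collection-a", "schema-1")
store.add_reference("collection-a", "schema-2")
store.add_reference("collection-b", "schema-2")

store.remove_collection("collection-a")
puts store.schemas.to_a.inspect   # schema-2 is still referenced by collection-b
```

Because schemas are shared between collections, a schema is only removed when the last collection that refers to it is removed.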

@ghost ghost assigned iterman Jul 26, 2012
@lydiam

lydiam commented Jul 26, 2012

What is the new location for collections and schemas? /tmp/folders?

@iterman
Contributor

iterman commented Jul 26, 2012

The temp directory is created with Ruby's Dir.mktmpdir. On the Mac that seems to be under /var/folder/ml, and the rest of the path is something random each time xmlresolution is recycled, like /hltshyv5241d35d3r904351m0000gp/T/d20120726-60100-1sxyq3r/. On darchive it is /var/daitss/tmp.
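
For reference, a small sketch of how `Dir.mktmpdir` from Ruby's standard `tmpdir` library behaves. The scratch `parent` directory below is an assumption that stands in for a site-specific location such as /var/daitss/tmp; it is not how xmlresolution is configured.

```ruby
require "tmpdir"
require "fileutils"

# With no arguments, Dir.mktmpdir creates a fresh directory under Dir.tmpdir.
# The default name prefix is "d", followed by a date, the process id, and a
# random suffix, which is where names like d20120726-60100-1sxyq3r come from.
default_dir = Dir.mktmpdir

# The second argument chooses the parent directory. A scratch directory is
# used here as a stand-in for a real root such as /var/daitss/tmp.
parent = Dir.mktmpdir("parent-")
rooted_dir = Dir.mktmpdir("d", parent)

puts default_dir
puts rooted_dir

# Clean up the example directories.
FileUtils.remove_entry(default_dir)
FileUtils.remove_entry(parent)
```

The directory name changes on every call, which is why the path differs each time the service is recycled.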

@cchou
Member Author

cchou commented Jul 26, 2012

Per Ira, in FDA production this will be in /var/daitss/tmp

iterman added a commit that referenced this issue Jul 27, 2012
iterman added a commit that referenced this issue Jul 27, 2012
@iterman
Contributor

iterman commented Sep 6, 2012

XMLResolution will store its collections, schemas, DTDs, and PIs in a temporary directory based on the temp_directory setting in the config file. The setting serves as the parent (partial root) of that directory. The full path is determined at runtime by Ruby and can be different each time XMLResolution recycles. The entire directory is deleted when XMLResolution is brought down. For example, if temp_directory = /var/daitss/tmp, then the temp directory will be something like /var/daitss/tmp/d20120828-15859-19ev0td. Schemas will be stored in /var/daitss/tmp/d20120828-15859-19ev0td/schemas and collections in /var/daitss/tmp/d20120828-15859-19ev0td/collections.
Collections are deleted from the collections directory when XMLResolution serves up the tarball. A schema is deleted when no collection refers to it. The name of the temporary directory is recorded in XMLResolution's log.
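
The layout described in this comment could be set up roughly as follows. This is a sketch under stated assumptions, not the actual XMLResolution code: `create_data_root` is a hypothetical helper, and a scratch directory stands in for the configured temp_directory value.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: create the per-run data root under the configured
# temp_directory, plus the schemas and collections subdirectories.
def create_data_root(temp_directory)
  root = Dir.mktmpdir("d", temp_directory)
  schemas = File.join(root, "schemas")
  collections = File.join(root, "collections")
  FileUtils.mkdir_p([schemas, collections])
  { root: root, schemas: schemas, collections: collections }
end

# Stand-in for the temp_directory config value (for example /var/daitss/tmp).
temp_directory = Dir.mktmpdir("config-root-")
layout = create_data_root(temp_directory)
puts "temp data directory: #{layout[:root]}"

# Remove the whole tree when the process exits, mirroring the described
# behavior of deleting the directory when the service is brought down.
at_exit do
  FileUtils.remove_entry(layout[:root]) if File.directory?(layout[:root])
  FileUtils.remove_entry(temp_directory) if File.directory?(temp_directory)
end
```

Tying cleanup to process exit means anything that stops the process also removes both subdirectories at once.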

@iterman
Contributor

iterman commented Nov 6, 2012

Issue has been addressed.

@lydiam

lydiam commented Nov 7, 2012

Jen has confirmed that this issue is resolved in production.

@lydiam lydiam closed this as completed Nov 7, 2012
@lydiam lydiam mentioned this issue Nov 15, 2012
@lydiam

lydiam commented Nov 15, 2012

Per today's DAITSS meeting, the following will be backed out: "The temporary directory and its contents, if there are any, will be deleted when xmlresolution is shut down. A schema is deleted from the schemas directory when no collection references the schema. A collection is deleted when core has requested the tar file from xmlresolution via the GET. The amount of storage xmlresolution consumes is greatly reduced." That should resolve #30.

@lydiam lydiam reopened this Nov 15, 2012
iterman added a commit that referenced this issue Nov 15, 2012
@lydiam

lydiam commented Nov 20, 2012

Testing this code change on ripple after code update. 2 SIPs were archived and the collections are retained:

[lydiam@ripple tmp]$ sudo -u daitss ls -l d20121120-6166-1bzo8pz/collections
total 16
drwxr-xr-x 2 daitss daitss 4096 Nov 20 14:40 ETFR45O00_A3GB20
drwxr-xr-x 2 daitss daitss 4096 Nov 20 15:00 EXQ0VM1ZU_F27QI1

Stopping daitss (sudo /etc/init.d/daitss stop) removes the parent directory, which also contains schemas.

After daitss was restarted and 3 packages were archived, the following collections exist:

[lydiam@ripple tmp]$ sudo -u daitss ls -l d20121120-10998-f4h3it/collections
total 24
drwxr-xr-x 2 daitss daitss 4096 Nov 20 15:43 E98TX8Y2F_PGLV27
drwxr-xr-x 2 daitss daitss 4096 Nov 20 15:41 EC0P51O4U_KJFQ23
drwxr-xr-x 2 daitss daitss 4096 Nov 20 15:42 EVITHB8OW_TLPZL8

Jen will check tomorrow afternoon to confirm that these are gone.

@lydiam

lydiam commented Nov 20, 2012

What exact mechanism causes the deletion of collections after 24 hours?

@iterman
Copy link
Contributor

iterman commented Nov 26, 2012

Lydia,

When the client requests that a new collection be started, xmlresolution
checks all previously created collections. If a collection is over 24 hours
old, it is deleted.
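
A minimal sketch of that mechanism, for illustration only. The helper names `expire_old_collections` and `start_collection` are hypothetical, and using the directory's modification time as the collection's age is an assumption; the actual xmlresolution code may track age differently.

```ruby
require "tmpdir"
require "fileutils"

MAX_AGE_SECONDS = 24 * 60 * 60

# Hypothetical helper: delete collection directories whose modification
# time is more than 24 hours old, and return the names that were removed.
def expire_old_collections(collections_dir, now = Time.now)
  Dir.children(collections_dir).select do |name|
    path = File.join(collections_dir, name)
    next false unless File.directory?(path)
    next false unless now - File.mtime(path) > MAX_AGE_SECONDS
    FileUtils.remove_entry(path)
    true
  end
end

# Hypothetical entry point: the sweep runs only when a new collection is
# requested, so nothing expires while the service is idle.
def start_collection(collections_dir, collection_id)
  expire_old_collections(collections_dir)
  FileUtils.mkdir_p(File.join(collections_dir, collection_id))
end

collections_dir = Dir.mktmpdir("collections-")
old = File.join(collections_dir, "OLD_COLLECTION")
Dir.mkdir(old)
two_days_ago = Time.now - 2 * MAX_AGE_SECONDS
File.utime(two_days_ago, two_days_ago, old)   # backdate the directory

start_collection(collections_dir, "NEW_COLLECTION")
remaining = Dir.children(collections_dir)
puts remaining.inspect

FileUtils.remove_entry(collections_dir)       # clean up the example
```

Because the sweep is triggered by a request rather than by a timer, an old collection stays on disk until the next collection is started.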


@lydiam

lydiam commented Nov 26, 2012

Per Ira, if there's no DAITSS activity, collections won't be deleted.

@lydiam

lydiam commented Nov 26, 2012

The following collections are no longer on ripple:
[lydiam@ripple tmp]$ sudo -u daitss ls -l d20121120-10998-f4h3it/collections
total 24
drwxr-xr-x 2 daitss daitss 4096 Nov 20 15:43 E98TX8Y2F_PGLV27
drwxr-xr-x 2 daitss daitss 4096 Nov 20 15:41 EC0P51O4U_KJFQ23
drwxr-xr-x 2 daitss daitss 4096 Nov 20 15:42 EVITHB8OW_TLPZL8

Neither is the d20121120* directory, which contained the schemas. DAITSS has not been restarted since:

2012 Nov 20 15:40:39 ripple DAITSS: Starting apache services...

Why would the top-level directory be deleted?

@lydiam

lydiam commented Nov 26, 2012

Apparently xmlresolution went down at 4:02 on Nov 25, which caused the top-level directory to be deleted. A new one was created this morning when a package was archived:

2012 Nov 25 04:02:34 ripple XmlResolution[11001]: INFO xmlresolution.ripple.fcla.edu: Temporary directory /var/daitss/tmp/d20121120-10998-f4h3it deleted
2012 Nov 25 04:02:34 ripple XmlResolution[11001]: INFO xmlresolution.ripple.fcla.edu: Ending XMLResolution Version 1.4.2, Git Revision 4d6b12f, Capistrano Release 20121120191144
2012 Nov 26 10:39:14 ripple XmlResolution[7848]: INFO xmlresolution.ripple.fcla.edu: Starting XMLResolution Version 1.4.2, Git Revision 4d6b12f, Capistrano Release 20121120191144
2012 Nov 26 10:39:14 ripple XmlResolution[7848]: INFO xmlresolution.ripple.fcla.edu: Initializing with temp data directory /var/daitss/tmp/d20121126-7848-1m5wocv; caching proxy is localhost:3128

(Note: xmlresolution went down at the same time on darchive. Both of these were due to a weekly 'service httpd reload'.)

@lydiam

lydiam commented Nov 26, 2012

Based on Ira's log of the filesystem, the collections were deleted once they were more than 24 hours old and xmlresolution ran against a new file (apparently xmlresolution has to be active for collections to be deleted):

[[email protected]]/var/daitss/tmp>date
Wed Nov 21 15:44:05 EST 2012
[[email protected]]/var/daitss/tmp>sudo -u daitss find ./d20121120-10998-f4h3it
[sudo] password for fclilt:
./d20121120-10998-f4h3it
./d20121120-10998-f4h3it/schemas
./d20121120-10998-f4h3it/schemas/4f7be979467992b1784deb50e05c8113
./d20121120-10998-f4h3it/schemas/ec5120b03418fc241c69833afabb0f87
./d20121120-10998-f4h3it/schemas/dd53f6f1289578ce8615429bd78cc993
./d20121120-10998-f4h3it/schemas/574ec4333106b214ed368b8c9f3168d4
./d20121120-10998-f4h3it/schemas/65ce7317cf9a124abcd3ff3c0a2db497
./d20121120-10998-f4h3it/schemas/35490e5b6108c4a95bc0a402bccfb1ff
./d20121120-10998-f4h3it/schemas/26782808bcb6c6b7facc3769269fb598
./d20121120-10998-f4h3it/schemas/89d71e161a94c2fa424345f4c2bfb5d0
./d20121120-10998-f4h3it/schemas/8fc01397b63fb805f5d4d3c080c1565c
./d20121120-10998-f4h3it/schemas/d3b9d7b40ba4945a6f8fb9d672e542f5
./d20121120-10998-f4h3it/schemas/fdadfc0c3ec58b5b401fc9dc600a8450
./d20121120-10998-f4h3it/collections
./d20121120-10998-f4h3it/collections/EM9722LZP_49EEZ1
./d20121120-10998-f4h3it/collections/EM9722LZP_49EEZ1/b49a538d73e2ead3819d352637766022
./d20121120-10998-f4h3it/collections/EVITHB8OW_TLPZL8
./d20121120-10998-f4h3it/collections/EVITHB8OW_TLPZL8/3cdcd43d10a0da3d9c6e29002cb79d40
./d20121120-10998-f4h3it/collections/EVITHB8OW_TLPZL8/3fc3b67ff82df282626c2ebd78585b12
./d20121120-10998-f4h3it/collections/EC0P51O4U_KJFQ23
./d20121120-10998-f4h3it/collections/EC0P51O4U_KJFQ23/63721547c1a1604f2783791f4010c7d0
./d20121120-10998-f4h3it/collections/EC0P51O4U_KJFQ23/c5339da37fe92f37e223377be57bb77c
./d20121120-10998-f4h3it/collections/E98TX8Y2F_PGLV27
./d20121120-10998-f4h3it/collections/E98TX8Y2F_PGLV27/d4d0c607fce78a59bef280c330f3cd01
./d20121120-10998-f4h3it/collections/EIF0RSZ8R_YY11RT
./d20121120-10998-f4h3it/collections/EIF0RSZ8R_YY11RT/c8d25af9be0c4c44aa088c0d0db992da
./d20121120-10998-f4h3it/collections/EO1MMWAZP_QKMZP0
./d20121120-10998-f4h3it/collections/EO1MMWAZP_QKMZP0/c8d25af9be0c4c44aa088c0d0db992da
./d20121120-10998-f4h3it/collections/ENPQNX23H_HHVNAJ
./d20121120-10998-f4h3it/collections/ENPQNX23H_HHVNAJ/c8d25af9be0c4c44aa088c0d0db992da
./d20121120-10998-f4h3it/collections/E2V215A9R_L2JUF0
./d20121120-10998-f4h3it/collections/E2V215A9R_L2JUF0/c8d25af9be0c4c44aa088c0d0db992da

[[email protected]]/var/daitss/tmp>

file system after 24 hours and a new put collection E56I60HGF_5KFKTC

notice EVITHB8OW_TLPZL8 is not on the file system anymore
