News:

Rule #1 - Be Patient - Rule #2 - Don't ask when, if you don't contribute - Rule #3 - You have coding skills - LinuxMCE's small brother is available: http://www.agocontrol.com

Main Menu

Huge query in pluto_media : taking my disk down to crawling

Started by bulek, September 20, 2013, 09:51:05 AM

Previous topic - Next topic

jamo

explain on my side

+----+-------------+----------------+--------+--------------------+---------+---------+-----------------------------------------+------+-------------+
| id | select_type | table          | type   | possible_keys      | key     | key_len | ref                                     | rows | Extra       |
+----+-------------+----------------+--------+--------------------+---------+---------+-----------------------------------------+------+-------------+
|  1 | PRIMARY     | <derived2>     | ALL    | NULL               | NULL    | NULL    | NULL                                    | 3017 |             |
|  1 | PRIMARY     | File           | eq_ref | PRIMARY,FatGroupBy | PRIMARY | 4       | s.PK_File                               |    1 |             |
|  2 | DERIVED     | File           | index  | Path,Filename      | PRIMARY | 4       | NULL                                    | 3231 | Using where |
|  2 | DERIVED     | Bookmark       | ref    | FK_File            | FK_File | 5       | pluto_media.File.PK_File                |    1 |             |
|  2 | DERIVED     | File_Attribute | ref    | PRIMARY,FK_File    | FK_File | 4       | pluto_media.File.PK_File                |    2 | Using index |
|  2 | DERIVED     | Attribute      | eq_ref | PRIMARY            | PRIMARY | 4       | pluto_media.File_Attribute.FK_Attribute |    1 |             |
|  2 | DERIVED     | LongAttribute  | ref    | FK_File            | FK_File | 5       | pluto_media.File.PK_File                |    1 |             |
|  2 | DERIVED     | Picture_File   | ref    | FK_File            | FK_File | 4       | pluto_media.File.PK_File                |    1 |             |
+----+-------------+----------------+--------+--------------------+---------+---------+-----------------------------------------+------+-------------+
8 rows in set (0.10 sec)

bushtech

Ran it on my core. New 1004 installation so fairly clean. 60 odd dvd's on media drive currently. Using a ssd as OS driveRan it on my core. New 1004 installation so fairly clean. 60 odd dvd's on media drive currently. Using a ssd as OS drive




| id | select_type | table          | type   | possible_keys      | key     | key_len | ref                                     | rows | Extra       |
+----+-------------+----------------+--------+--------------------+---------+---------+-----------------------------------------+------+-------------+
|  1 | PRIMARY     | <derived2>     | ALL    | NULL               | NULL    | NULL    | NULL                                    |  542 |             |
|  1 | PRIMARY     | File           | eq_ref | PRIMARY,FatGroupBy | PRIMARY | 4       | s.PK_File                               |    1 |             |
|  2 | DERIVED     | File           | index  | Path,Filename      | PRIMARY | 4       | NULL                                    |  162 | Using where |
|  2 | DERIVED     | Bookmark       | ref    | FK_File            | FK_File | 5       | pluto_media.File.PK_File                |    1 |             |
|  2 | DERIVED     | File_Attribute | ref    | PRIMARY,FK_File    | FK_File | 4       | pluto_media.File.PK_File                |    3 | Using index |
|  2 | DERIVED     | Attribute      | eq_ref | PRIMARY            | PRIMARY | 4       | pluto_media.File_Attribute.FK_Attribute |    1 |             |
|  2 | DERIVED     | LongAttribute  | ref    | FK_File            | FK_File | 5       | pluto_media.File.PK_File                |    1 |             |
|  2 | DERIVED     | Picture_File   | ref    | FK_File            | FK_File | 4       | pluto_media.File.PK_File                |    1 |             |
+----+-------------+----------------+--------+--------------------+---------+---------+-----------------------------------------+------+-------------+
8 rows in set (0.02 sec)



Jamo asked me to post this link to his explain code:  http://pastebin.com/jnQsJGAc











Core:Asus P8H77-V, Intel core i3-3240 LGA155, Inno 3D Nvidia 240 GT (DVI+VGA+HDMI), 4Gb ram, 125Gb SSD
MD: Zotax Zbox ID41

bulek

Hi,

here are the results for my database:

Quotemysql> select count(*) from Picture_File;
+----------+
| count(*) |
+----------+
|     3999 |
+----------+
1 row in set (0.04 sec)

mysql> select count(*) from File;
+----------+
| count(*) |
+----------+
|  2422647 |
+----------+
1 row in set (1.37 sec)

mysql> select count(*) from File_Attribute;
+----------+
| count(*) |
+----------+
|    41181 |
+----------+
1 row in set (0.34 sec)

mysql> select count(*) from LongAttribute;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.02 sec)


Regards,
Bulek.
Thanks in advance,

regards,

Bulek.

tschak909

After letting updatemedia crawl my DB from a clean wipe of the pluto_media db, The resulting query time still is approximately 10 seconds.

I have put my pluto_media.sql here: https://96.226.245.101/work/pluto_media.sql ... if anyone wants to look.

-Thom

golgoj4


SELECT COUNT( * ) FROM FILE - 28847  Query took 0.0001 sec

SELECT COUNT( * )  FROM Picture_File - 6358 - is instant a time


select count(*) from File_Attribute; -  105293   - no seriously, is it?

1004 running samsung SSD




Linuxmce - Where everyone is never wrong, but we are always behind xbmc in the media / ui department.

tschak909

I think the purpose of posde asking for these numbers is to find where the bottlenecks would be in the large query. Asking for a linear return of a table, without any joins, will _always_ be fast.

-Thom

buckle

Making sure all the join conditions are indexed should help.


CREATE INDEX IX_Bookmark_FK_File
    ON Bookmark (FK_File);

CREATE INDEX IX_LongAttribute_FK_File
    ON LongAttribute (FK_File);


Using Thom's database on my machine I went from 20 seconds to < 3 seconds.


Typically I would check to make sure I had properly indexed anything in a where clause as well.

Adding one for Missing should also improve the query performance, although the difference for me was negligible.


CREATE INDEX IX_File_Missing
ON File(Missing);


FileName and Path indexes did not yield anything for me.  Here they are anyway...

CREATE INDEX ix_File_Path
ON File;

CREATE INDEX ix_File_FileName
ON File;


Hope this helps.

-buckle

Marie.O

If I helped you, feel free to buy me a coffee: [url="https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=2VKASZLTJH7ES"]https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=2VKASZLTJH7ES[/url]

phenigma

Here's a version that I'm testing now.  It fixes the CurrentDbAttrCount and HasAttributes values that were not reporting correctly in jamo's post.  I am no one as far as mysql goes but I've been bashing on this for a couple hours.  

The missing index was not implemented on my db.

Thanks jamo, buckle, bulek, tschak, and others for working on this query!  This has been a great collaboration!


SELECT
 File.PK_File,
 File.Path,
 File.Filename,
 File.Inode,  
 s.CurrentDbAttrDate,
 s.CurrentDbAttrCount,
 s.HasAttributes,  
 File.AttrDate AS OldDbAttrDate,
 File.AttrCount AS OldDbAttrCount,    
 File.ModificationDate AS OldFileDate,
 File.Source

FROM
(
 SELECT
   PK_File,
   greatest(
     (IF(Bookmark.psc_mod IS NULL, CAST('0000-00-00 00:00:00' AS DATE),Bookmark.psc_mod)),
     (IF(Attribute.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), Attribute.psc_mod)),
     (IF(LongAttribute.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), LongAttribute.psc_mod)),
     (IF(Picture_File.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), Picture_File.psc_mod)) ) As CurrentDbAttrDate,
   (select count(*) from Bookmark where FK_File=PK_File) as BOOKMARKCOUNT,
   (select count(*) from File_Attribute where FK_File=PK_File) as ACOUNT,    
   (select count(*) from LongAttribute where FK_File=PK_File) as LONGATTRCOUNT,
   (select count(*) from Picture_File where FK_File=PK_File) as IMGCOUNT,
   ((1000000 * (select BOOKMARKCOUNT)) +
     (10000 * (select ACOUNT)) +    
     (100 * (select LONGATTRCOUNT)) +
     (select IMGCOUNT)) as CurrentDbAttrCount,
   ((select ACOUNT) + (select LONGATTRCOUNT)) as HasAttributes

 FROM File
   LEFT JOIN Bookmark ON Bookmark.FK_File = PK_File
   LEFT JOIN File_Attribute ON File_Attribute.FK_File = PK_File
   LEFT JOIN Attribute ON File_Attribute.FK_Attribute = PK_Attribute
   LEFT JOIN LongAttribute ON LongAttribute.FK_File = PK_File
   LEFT JOIN Picture_File ON Picture_File.FK_File = PK_File

 WHERE 1 = 1
   AND ( Path LIKE '/home/public/data%' OR (Path ='/home/public' AND Filename = 'data' )  
     OR Path LIKE '/home/user_1/data%' OR (Path = '/home/user_1' AND Filename = 'data' )  
     OR Path LIKE '/home/user_2/data%' OR (Path = '/home/user_2' AND Filename = 'data' )  
     OR Path LIKE '/home/user_3/data%' OR (Path = '/home/user_3' AND Filename = 'data' )  
     OR Path LIKE '/home/user_4/data%' OR (Path = '/home/user_4' AND Filename = 'data' )
     OR Path LIKE '/home/user_5/data%' OR (Path = '/home/user_5' AND Filename = 'data' )  
     OR Path LIKE '/home/user_6/data%' OR (Path = '/home/user_6' AND Filename = 'data' )  
     OR Path LIKE '/home/user_7/data%' OR (Path = '/home/user_7' AND Filename = 'data' )  
     OR Path LIKE '/home/user_8/data%' OR (Path = '/home/user_8' AND Filename = 'data' ) )  
   AND Missing = 0

   GROUP BY PK_File
) s

INNER JOIN File ON s.PK_File=File.PK_File;


J.
My setup: [url="http://wiki.linuxmce.org/index.php/User:Phenigma"]http://wiki.linuxmce.org/index.php/User:Phenigma[/url]

tschak909

Can we get at least one more successful test of this, before I plop this into UpdateMedia/MediaState.cpp ?

-Thom

buckle

I have a new query that also addressed the attribute counts issue with Jamo's version. 

I think it is about as simplified as I can make it.  I believe it was over-complicated from the beginning, at least for what the query set out to achieve.


SELECT
F.PK_File
,F.Path
,F.Filename
,F.Inode
,greatest(
IFNULL(B.max_psc_mod, CAST('0000-00-00 00:00:00' AS DATE)),
IFNULL(A.max_psc_mod, CAST('0000-00-00 00:00:00' AS DATE)),
IFNULL(LA.max_psc_mod, CAST('0000-00-00 00:00:00' AS DATE)),
IFNULL(PF.max_psc_mod, CAST('0000-00-00 00:00:00' AS DATE)),
CAST('0000-00-00 00:00:00' AS DATE)
) As CurrentDbAttrDate
,(1000000 * IFNULL(B.Bookmark_Count, 0)
+ 10000 * IFNULL(A.Attribute_Count, 0)
+ 100 * IFNULL(LA.Long_Attribute_Count, 0)
+ IFNULL(PF.Picture_Count, 0)
) AS CurrentDbAttrCount
,IFNULL(A.Attribute_Count,0)+IFNULL(LA.Long_Attribute_Count,0) AS HasAttributes
,F.AttrDate AS OldDbAttrDate
,F.AttrCount AS OldDbAttrCount
,F.ModificationDate AS OldFileDate
,F.Source
FROM
File F
LEFT JOIN (select MAX(psc_mod) as max_psc_mod,  FK_File as FK_File, count(*) as Bookmark_Count from Bookmark group by FK_File) B ON B.FK_File = F.PK_File
LEFT JOIN (select MAX(Attrib.psc_mod) as max_psc_mod, count(*) as Attribute_Count, FileAttrib.FK_File as FK_File from File_Attribute FileAttrib left join Attribute Attrib ON FileAttrib.FK_Attribute = Attrib.PK_Attribute group by FK_File) A on A.FK_File = F.PK_File
LEFT JOIN (select MAX(psc_mod) as max_psc_mod, FK_File as FK_File, count(*) as Long_Attribute_Count from LongAttribute group by FK_File) LA ON LA.FK_File = F.PK_File
LEFT JOIN (select MAX(psc_mod) as max_psc_mod, FK_File as FK_File, count(*) as Picture_Count from Picture_File group by FK_File) PF ON PF.FK_File = F.PK_File
WHERE 1 = 1
AND ( Path LIKE '/home/public/data%' OR (Path ='/home/public' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_1/data%' OR (Path = '/home/user_1' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_2/data%' OR (Path = '/home/user_2' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_3/data%' OR (Path = '/home/user_3' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_4/data%' OR (Path = '/home/user_4' AND Filename = 'data' )
      OR Path LIKE '/home/user_5/data%' OR (Path = '/home/user_5' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_6/data%' OR (Path = '/home/user_6' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_7/data%' OR (Path = '/home/user_7' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_8/data%' OR (Path = '/home/user_8' AND Filename = 'data' ) ) 
AND Missing = 0


I used phenigma's where clause for the sake of being thorough, and because it looked nice. :)

It compares exactly 100% against the original query in my test environment (from tschak's script). 

It performs pretty well, too.  Let me know how it works for you.

-buckle

phenigma

Re:
#26
Well...  As we found last night.  Buckle's query takes over 20 minutes to run on my 1004.  Jamo's with the additional corrections I made last night runs in 6 seconds.  The original takes over 25 seconds.  

http://pastebin.com/9XxjscCf <- jamo's with corrections

Sent from my SGH-I927 using Tapatalk 4
My setup: [url="http://wiki.linuxmce.org/index.php/User:Phenigma"]http://wiki.linuxmce.org/index.php/User:Phenigma[/url]

phenigma

I'm hoping we can get some more people to test the various versions and report back about the timings.

Sent from my SGH-I927 using Tapatalk 4
My setup: [url="http://wiki.linuxmce.org/index.php/User:Phenigma"]http://wiki.linuxmce.org/index.php/User:Phenigma[/url]

bushtech

phenigma:

Ran jamo's query from your last post:

But I get this error:

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that       corresponds to your MySQL server version for the right syntax to use near 'ON PK      _Attribute=FK_Attribute WHERE FK_File=PK_File),
(select (IFNULL(MAX(LongAtt' at line 22
Core:Asus P8H77-V, Intel core i3-3240 LGA155, Inno 3D Nvidia 240 GT (DVI+VGA+HDMI), 4Gb ram, 125Gb SSD
MD: Zotax Zbox ID41

jamo

Bushtek...

there should be no space betweekn "PK" and "_Attribute". I think copying and pasting has introduced an error. It should read "ON PK_Attribute=FK_Attribute" or something like that.

Quote from: bushtech on September 30, 2013, 07:21:37 AM
phenigma:

Ran jamo's query from your last post:

But I get this error:

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that       corresponds to your MySQL server version for the right syntax to use near 'ON PK      _Attribute=FK_Attribute WHERE FK_File=PK_File),
(select (IFNULL(MAX(LongAtt' at line 22