jamo
« on: July 17, 2012, 07:19:24 am »
I've raised this in IRC but haven't been able to hang around long enough to troubleshoot with any of the devs. I'm keen to help with the troubleshooting here but I need a wee bit of guidance. I'll outline where I've got to so far. Let me know if it would be better to open a trac ticket for this.

I have a 10.04 system with a headless core and a few MDs. I recently added a second SATA HDD to the system to store my media. I followed the wiki entry at http://wiki.linuxmce.org/index.php/Add_an_additional_hard_drive and all went well (aside: I created an ext4 filesystem as the xfs filesystem wasn't available as an option. Perhaps this is outdated?) up to the point where I expected a message to pop up on the orbiters saying a drive had been detected and asking whether I want to use it. No message.

I then, as per L3MCE's suggestion, manually ran /usr/pluto/bin/Storage_Devices_Radar.sh (or similar) but still no message. I hacked the script to figure out what was wrong and, as far as I can tell, it is doing what it's supposed to do: figuring out that there is a new internal hard drive on the system that is not currently used, and then sending all that as a message using /usr/pluto/bin/MessageSend.sh with all the appropriate parameters (e.g. device template, size, path, etc.).

L3 seems to think there is something wrong with this script, but I don't think so because it is working fine on the MDs (exact same script). I couldn't test with an HDD there, but I tested with a USB flash drive and it pops up the alert perfectly. The same USB flash drive plugged into the core does not pop up an alert. So my thinking is that something goes wrong *after* the message is sent, and that it is specific to my core or to headless cores in general.

What I'd like to know is this: what device/plugin/script/code is supposed to deal with/listen to/respond to the message that is sent? I can see the message in the dcerouter log (it gets sent every 10 min), but I don't know what is supposed to happen next, so I can't figure out why it isn't happening. If someone can point me in that direction I can do a bit more digging.

I can post the message that is sent when I get home as well if that will help, but from what I recall the "from device" is 1 (core) and the "to device" is -1001, which apparently is used for events that aren't really aimed at a device per se. The device template is 1790 (int hard drive), and there are a few other parameters. Any pointers welcome.
« Last Edit: July 31, 2012, 10:22:02 am by jamo »

JaseP
« Reply #1 on: July 17, 2012, 01:59:42 pm »
Does the drive show up in the fstab,... and what are the contents of the fstab?
See my User page on the LinuxMCE Wiki for a description of my system configuration (click the little globe under my profile pic).

tschak909
« Reply #2 on: July 17, 2012, 05:05:38 pm »
To explain the process:
First, to get this out of the way, NO disks other than / should EVER show up in the fstab. EVER.
The way it works is this:
There is a StorageDevices_Radar.sh that is called every few seconds, that scans the system looking for viable disks. This script tries to prune out any false positives (the system disk, any swap disks...anything we can't really read.) The results are collated, checked against the database for existing disks, and a MessageSend is called for each newly detected device.
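The pruning step described above can be sketched roughly as follows. This is not the real StorageDevices_Radar.sh: the function name `prune_candidates` and the three-column input format (device, filesystem type, mount point) are assumptions made purely for illustration, and the database check and the MessageSend call are left out.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the real StorageDevices_Radar.sh.
# Reads "device fstype mountpoint" lines on stdin (use "-" for "none")
# and prints the devices that survive pruning.
prune_candidates() {
    awk '
        $3 == "/"             { next }   # the system disk
        $2 == "swap"          { next }   # swap partitions
        $2 == "" || $2 == "-" { next }   # no filesystem we can read
        { print $1 }
    '
}
```

Feeding it a list that contains a root partition, a swap partition and one unused ext4 partition should leave only the ext4 partition, which is the kind of entry the radar would then report.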
That message is then intercepted by the Plug and Play plugin, which shoves the detected device into the PnpQueue table. The Plug and Play plugin has a thread which checks this table, matching it against a complex mash of heuristics (even going as far as attempting to determine if a particular device has moved machines, etc.), and the plugin then emits SCREEN messages to the appropriate orbiters to act upon this information, matching against a device category, a specific template (the internal disk radar passes back a specific template, #1790), specific vendor/model information, MAC addresses, etc. The Plug and Play plugin then calls CreateDevice to create a device for the tree, filling in device data that matches the entries in PnpQueue. Then the Plug and Play plugin calls Configure_1790.sh in this case, because it is specified in the Configuration Script device data for the template. This script is passed enough parameters to be able to set up the LinuxMCE file structure, if needed, and to do some small bits of housecleaning and other logic. Once all this is done, the router should be reloaded.
Once this is done, /etc/auto.plutoStorageDevices can actively query the database and figure out whether the disk is local (in which case a local mount is done) or remote (in which case a network mount is done). This is called each and every time the directory /mnt/device/XX (where XX is the disk's device # in the device tree) is accessed, and therefore the disk is automatically mounted at that time. After a small timeout of no disk activity, the disk is unmounted; this repeats each time the filesystem is accessed. As part of the setup process, symbolic links to /mnt/device/XX are created in /home/public and each /home/user for any disks specified as LinuxMCE File Structure, OR the disk is symlinked in the other/ directory under /home/public or /home/user_X (selected user). This is so UpdateMedia can easily traverse all the disks that LinuxMCE knows about and gather filesystem data from them. This also explains why, when you use certain filesystems that emit log messages when mounted, you see constant mounting and unmounting: UpdateMedia is crawling the disk.
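One way to see which disks have been wired up this way is to list the symlinks that point under /mnt/device/. The sketch below only inspects symlinks and does not touch the automounter or the database. The function name `list_device_links` is hypothetical, and the actual link names LinuxMCE creates are not assumed here.

```shell
#!/bin/sh
# Hypothetical helper: list symlinks under a directory whose target lies
# under /mnt/device/, printing "link -> target" for each one found.
list_device_links() {
    dir=$1
    find "$dir" -type l 2>/dev/null | while IFS= read -r link; do
        target=$(readlink "$link")
        case $target in
            /mnt/device/*) printf '%s -> %s\n' "$link" "$target" ;;
        esac
    done
}
```

Running `list_device_links /home/public` on the core would be the obvious first use. Because readlink only reads the link itself, this should not trigger a mount of the disk behind it.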
-Thom

l3mce
« Reply #3 on: July 17, 2012, 05:43:13 pm »
As a test, please replace
"$DD_UUID|$partition_uuid|277|$partition_size" with just 277 in the messagesend line of StorageDevices_Radar.sh, reload the router, and plug in a USB device.
I do not believe this is relevant but would like you to do it anyway.
Then I would like you to paste your /etc/pluto.conf file here, please.
I never quit... I just ping out.

JaseP
« Reply #4 on: July 17, 2012, 06:52:16 pm »
Quote from tschak909: "First, to get this out of the way, NO disks other than / should EVER show up in the fstab. EVER."
Did not know that about LinuxMCE... thanks, that's informative. (BTW, mine has the home partition in it, since I have a separate / and /home.)
OK, jamo, try:
cat /proc/mounts > mounted_file_systems.txt
What's in the text file? I'm guessing the drive isn't being mounted.
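Instead of reading the whole dump by eye, the check can be narrowed to a single device. The helper below is a minimal sketch: `is_mounted` is a hypothetical name, and the optional second argument exists only so the function can be pointed at a saved copy such as mounted_file_systems.txt. Since the disks are described earlier in the thread as being mounted on access and unmounted after a timeout, a negative result on its own is not conclusive.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given device appears in a
# /proc/mounts-style file (first field = device, second = mount point).
# The file defaults to /proc/mounts; pass another path to check a saved copy.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}
```

For example, `is_mounted /dev/sdb1 && echo mounted` (the device name is only a placeholder).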
« Last Edit: July 17, 2012, 06:54:19 pm by JaseP »

l3mce
« Reply #5 on: July 17, 2012, 06:57:34 pm »
JaseP
The drive is only mounted for a brief second to extract the data needed to send a message describing what it is. If you answer "no, not now" to that query, it is not mounted, and a message is rebroadcast some time later while you decide what you want to do with it. If you say "no, and don't bother me again", it will never be mounted. If you answer yes, it is mounted and a symlink is created from its device number.
It is sending a message... this message is not reaching the orbiters. This is a headless install. It is my belief that the only thing receiving the message is the only thing incapable of displaying it, and that is the headless core.
The problem exists between message send and the orbiters that are active. The system can never receive confirmation for use.
« Last Edit: July 17, 2012, 06:59:35 pm by l3mce »

JaseP
« Reply #6 on: July 17, 2012, 07:24:03 pm »
Ahhh,... OK,... Sorry,... My bad,... I was under the mistaken impression that the drive wasn't showing up at all on the Core...
Now that I think of it,... Didn't Jamo say, a while back, that the Core had been a Core/Hybrid, but he switched it to a headless? Could it be something in the conversion process that some part of the system thinks the Core still functions as an MD, and is routing its alert messages to a MD that doesn't exist any more?

jamo
« Reply #7 on: July 17, 2012, 09:39:16 pm »
Thom, that info is pure gold. Sweeet. OK, look forward to digging further when I get back from silly corporate ra-ra event. In the meantime, let me do the test.
Made the change to the script, reloaded router, plugged in portable USB drive to router. No messages on any MDs.
/etc/pluto.conf:
# Pluto config file
MySqlHost = localhost
MySqlUser = root
MySqlPassword =
MySqlDBName = pluto_main
DCERouter = localhost
MySqlPort = 3306
DCERouterPort = 3450
PK_Device = 1
Activation_Code = 1111
PK_Installation = 1028691
PK_Users = 1
PK_Distro = 18
Display = 0
SharedDesktop = 1
OfflineMode = false
#<-mkr_b_videowizard_b->
UseVideoWizard = 1
#<-mkr_b_videowizard_e->
LogLevels = 1,5,7,8
#ImmediatelyFlushLog = 1
AutostartCore=1
AutostartMedia=0
FirstBoot = false
AVWizardOverride = 1
TimeZoneSet = 1
LastSearchTokenUpdate=1341890379
DVDKeysCache = /home/.dvdcss
PlutoVersion = 2.0.0.45.12070126149
Bookmark_Media = 4,5
RA_CheckRemotePort = 1
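When comparing this file between a headless core and a working MD, it can help to pull out single values rather than diffing whole files. A minimal sketch, assuming the simple "Key = value" / "Key=value" layout shown above; `conf_get` is a hypothetical helper, not a LinuxMCE tool, and it only handles plain alphanumeric key names.

```shell
#!/bin/sh
# Hypothetical helper: print the value of KEY from a pluto.conf-style file.
# Handles both "Key = value" and "Key=value"; commented-out lines ("#Key = ...")
# are skipped because the key must start the line.
conf_get() {
    key=$1
    file=${2:-/etc/pluto.conf}
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" | head -n 1
}
```

For example, `conf_get PK_Device` and `conf_get Display` print just those two settings from /etc/pluto.conf.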

l3mce
« Reply #8 on: July 17, 2012, 11:07:23 pm »
Quote from JaseP: "Ahhh,... OK,... Sorry,... My bad,... I was under the mistaken impression that the drive wasn't showing up at all on the Core...
Now that I think of it,... Didn't Jamo say, a while back, that the Core had been a Core/Hybrid, but he switched it to a headless? Could it be something in the conversion process that some part of the system thinks the Core still functions as an MD, and is routing its alert messages to a MD that doesn't exist any more?"
It is my understanding that he chose a core only install. He will have to clarify. If so, this is likely my fault, as I chunk out a bunch of stuff which would normally be written to the DB for the ability to be an MD... some missing parameter is probably at fault for messages stopping at a nonexistent hybrid field.

tschak909
« Reply #9 on: July 18, 2012, 12:06:35 am »
Plug and play messages are supposed to make it to any visible standard orbiter after a timeout.
From PnpQueue.cpp:
        LoggerWrapper::GetInstance()->Write(LV_STATUS,"PnpQueue::Process_Detect_Stage_Prompting_User_For_DT queue %d multiple choices",
            pPnpQueueEntry->m_pRow_PnpQueue->PK_PnpQueue_get());
#endif
    DCE::SCREEN_NewPnpDevice_DL SCREEN_NewPnpDevice_DL(m_pPlug_And_Play_Plugin->m_dwPK_Device, pPnpQueueEntry->m_sPK_Orbiter_List_For_Prompts, GetDescription(pPnpQueueEntry), pPnpQueueEntry->m_pRow_PnpQueue->PK_PnpQueue_get(),interuptOnlyAudio,false,true);
    m_pPlug_And_Play_Plugin->SendCommand(SCREEN_NewPnpDevice_DL);
    return false; // Now we wait
}
From PnpQueue.h:
    class OH_Orbiter *m_pOH_Orbiter_Active; // The Orbiter to use for displaying messages
This is a struct that is borrowed from Orbiter_Plugin. A little bit of a scan reveals, in Plug_and_Play.cpp:
//<-dceag-c700-b->

    /** @brief COMMAND: #700 - Choose Pnp Device Template */
    /** We have chosen a new pnp device template */
        /** @param #57 PK_Room */
            /** The room this is in. 0 if not known */
        /** @param #150 PK_DHCPDevice */
            /** The template for the device */
        /** @param #224 PK_PnpQueue */
            /** The queue entry we're selecting for */

void Plug_And_Play_Plugin::CMD_Choose_Pnp_Device_Template(int iPK_Room,int iPK_DHCPDevice,int iPK_PnpQueue,string &sCMD_Result,Message *pMessage)
//<-dceag-c700-e->
{
    PLUTO_SAFETY_LOCK(pnp,m_PnpMutex);
    PnpQueueEntry *pPnpQueueEntry = m_pPnpQueue->m_mapPnpQueueEntry_Find(iPK_PnpQueue);
    if( !pPnpQueueEntry )
    {
        LoggerWrapper::GetInstance()->Write(LV_CRITICAL, "PnpQueue::PickDeviceTemplate queue %d is invalid", iPK_PnpQueue);
        return;
    }

    pPnpQueueEntry->m_pOH_Orbiter_Active_set(pMessage->m_dwPK_Device_From);
    LoggerWrapper::GetInstance()->Write(LV_STATUS,"Plug_And_Play_Plugin::CMD_Choose_Pnp_Device_Template queue %d set to orbiter %p/%d",
        pPnpQueueEntry->m_pRow_PnpQueue->PK_PnpQueue_get(), pPnpQueueEntry->m_pOH_Orbiter_Active_get(), pMessage->m_dwPK_Device_From);
    pPnpQueueEntry->m_EBlockedState=PnpQueueEntry::pnpqe_blocked_none;
    if( iPK_DHCPDevice )
    {
        pPnpQueueEntry->m_iPK_DHCPDevice = iPK_DHCPDevice;
        DCE::CMD_Remove_Screen_From_History_DL CMD_Remove_Screen_From_History_DL(
            m_dwPK_Device, m_pOrbiter_Plugin->m_sPK_Device_AllOrbiters, StringUtils::itos(pPnpQueueEntry->m_pRow_PnpQueue->PK_PnpQueue_get()), SCREEN_NewPnpDevice_CONST);
        DCE::CMD_Remove_Screen_From_History_DL CMD_Remove_Screen_From_History_DL2(
            m_dwPK_Device, m_pOrbiter_Plugin->m_sPK_Device_AllOrbiters, StringUtils::itos(pPnpQueueEntry->m_pRow_PnpQueue->PK_PnpQueue_get()), SCREEN_New_Pnp_Device_One_Possibility_CONST);
        CMD_Remove_Screen_From_History_DL.m_pMessage->m_vectExtraMessages.push_back(CMD_Remove_Screen_From_History_DL2.m_pMessage);
        SendCommand(CMD_Remove_Screen_From_History_DL);
    }
    if( iPK_Room )
        pPnpQueueEntry->m_iPK_Room = iPK_Room;
    pthread_cond_broadcast( &m_PnpCond );
}
If you keep looking, you'll find code where m_bUseAllOrbiters gets set periodically if the initial device_from being sent to doesn't respond quickly enough. This is done with each scanning of the pnpqueue entry. I leave it as an exercise for the reader to figure out where this is.
    if( pPnpQueueEntry->m_EBlockedState==PnpQueueEntry::pnpqe_blocked_prompting_options )
        pPnpQueueEntry->UseAllOrbitersForPrompt(); // The user isn't responding. Ask on all orbiters
Which is parsed by PnpQueueEntry, to become:
void PnpQueueEntry::UseAllOrbitersForPrompt()
{
    m_bUseAllOrbitersForPrompt=true;
    m_sPK_Orbiter_List_For_Prompts=m_pPlug_And_Play_Plugin->m_pOrbiter_Plugin_get()->m_sPK_Device_AllOrbiters_get();
#ifdef DEBUG
    LoggerWrapper::GetInstance()->Write(LV_STATUS, "PnpQueueEntry::UseAllOrbitersForPrompt queue %d", m_pRow_PnpQueue->PK_PnpQueue_get());
#endif
}
Seriously guys, STOP JUST BLIND ASS GUESSING, AND ACTUALLY LOOK AT OUR CODE! -Thom
« Last Edit: July 18, 2012, 12:11:49 am by tschak909 »

l3mce
« Reply #10 on: July 18, 2012, 12:20:26 am »
The message is being sent, and nothing is happening. I take this to mean that something in the message is malformed because the device itself is missing chunks of assignment in the DB. The only difference between his install and everyone else's is that he chose the headless install... so specifically I believe the problem lies in this never occurring during install:
/usr/pluto/bin/CreateDevice -d $DEVICE_TEMPLATE_MediaDirector -C "$Core_PK_Device"
Hybrid_DT=$(RunSQL "SELECT PK_Device FROM Device WHERE FK_DeviceTemplate='$DEVICE_TEMPLATE_MediaDirector' LIMIT 1")
Q="UPDATE Device SET Description='The core/hybrid' WHERE PK_Device='$Hybrid_DT'"
RunSQL "$Q"
## Set UI interface
Q="SELECT PK_Device FROM Device WHERE FK_Device_ControlledVia='$Hybrid_DT' AND FK_DeviceTemplate=62"
OrbiterDevice=$(RunSQL "$Q")
But that is just a blind ass guess.
« Last Edit: July 18, 2012, 12:22:11 am by l3mce »

tschak909
« Reply #11 on: July 18, 2012, 12:28:24 am »
Does the PnpQueue table produce anything interesting?
-Thom
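A quick way to look is to pull the newest rows straight from the table. This is a sketch under assumptions: PK_PnpQueue is taken to be the key column because the plugin code quoted in this thread calls PK_PnpQueue_get(); no other column names are assumed, hence SELECT *. The helper name `pnpqueue_sql` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build a query for the newest PnpQueue rows.
# Assumes PK_PnpQueue is the table's key column (inferred from the
# PK_PnpQueue_get() accessor in the quoted plugin code).
pnpqueue_sql() {
    limit=${1:-10}
    case $limit in
        ''|*[!0-9]*) echo "limit must be a number" >&2; return 1 ;;
    esac
    printf 'SELECT * FROM PnpQueue ORDER BY PK_PnpQueue DESC LIMIT %s;\n' "$limit"
}
```

On the core, something like `mysql -u root pluto_main -e "$(pnpqueue_sql 5)"` should show the latest entries, assuming the mysql client is installed and the database name and passwordless root user match the pluto.conf posted earlier.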

jamo
« Reply #12 on: July 18, 2012, 08:58:00 pm »
Quote from l3mce: "It is my understanding that he chose a core only install. He will have to clarify. If so, this is likely my fault, as I chunk out a bunch of stuff which would normally be written to the DB for the ability to be an MD... some missing parameter is probably at fault for messages stopping at a non existent hybrid field."
Correct, core-only install. First time 'round I set it up as hybrid when I was testing the Sandy Bridge stuff, but once we got that out of the way I did core only, nice and clean. Will dig more when I get home.

jamo
« Reply #13 on: July 20, 2012, 07:24:13 am »
Quote from tschak909: "Does the PnpQueue table produce anything interesting? -Thom"
Will look.

jamo
« Reply #14 on: July 20, 2012, 07:32:39 pm »
Oops. I think I broke it. Plan at this stage is to download the latest snapshot and start with a clean install. Will take a week or so because I have to download the snapshot at work. I'll report back when I get back to this stage.