From bbaecc088245e840e59a5abe23d69cf7748b3c88 Mon Sep 17 00:00:00 2001 From: Mike Frysinger Date: Mon, 13 Oct 2014 15:52:03 -0700 Subject: binfmt_misc: expand the register format limit to 1920 bytes The current code places a 256 byte limit on the registration format. This ends up being fairly limited when you try to do matching against a binary format like ELF: - the magic & mask formats cannot have any embedded NUL chars (string_unescape_inplace halts at the first NUL) - each escape sequence quadruples the size: \x00 is needed for NUL - trying to match bytes at the start of the file as well as further on leads to a lot of \x00 sequences in the mask - magic & mask have to be the same length (when decoded) - still need bytes for the other fields - impossible! Let's look at a concrete (and common) example: using QEMU to run MIPS ELFs. The name field uses 11 bytes "qemu-mipsel". The interp uses 20 bytes "/usr/bin/qemu-mipsel". The type & flags takes up 4 bytes. We need 7 bytes for the delimiter (usually ":"). We can skip offset. So already we're down to 107 bytes to use with the magic/mask instead of the real limit of 128 (BINPRM_BUF_SIZE). If people use shell code to register (which they do the majority of the time), they're down to ~26 possible bytes since the escape sequence must be \x##. The ELF format looks like (both 32 & 64 bit): e_ident: 16 bytes e_type: 2 bytes e_machine: 2 bytes Those 20 bytes are enough for most architectures because they have so few formats in the first place, thus they can be uniquely identified. That also means for shell users, since 20 is smaller than 26, they can sanely register a handler. But for some targets (like MIPS), we need to poke further. The ELF fields continue on: e_entry: 4 or 8 bytes e_phoff: 4 or 8 bytes e_shoff: 4 or 8 bytes e_flags: 4 bytes We only care about e_flags here as that includes the bits to identify whether the ELF is O32/N32/N64. But now we have to consume another 16 bytes (for 32 bit ELFs) or 28 bytes (for 64 bit ELFs) just to match the flags. If every byte is escaped, we send 288 more bytes to the kernel ((20 {e_ident,e_type,e_machine} + 12 {e_entry,e_phoff,e_shoff} + 4 {e_flags}) * 2 {mask,magic} * 4 {escape}) and we've clearly blown our budget. Even if we try to be clever and do the decoding ourselves (rather than relying on the kernel to process \x##), we still can't hit the mark -- string_unescape_inplace treats mask & magic as C strings so NUL cannot be embedded. That leaves us with having to pass \x00 for the 12/24 entry/phoff/shoff bytes (as those will be completely random addresses), and that is a minimum requirement of 48/96 bytes for the mask alone. Add up the rest and we blow through it (this is for 64 bit ELFs): magic: 20 {e_ident,e_type,e_machine} + 24 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 48 # ^^ See note below. mask: 20 {e_ident,e_type,e_machine} + 96 {e_entry,e_phoff,e_shoff} + 4 {e_flags} = 120 Remember above we had 107 left over, and now we're at 168. This is of course the *best* case scenario -- you'll also want to have NUL bytes in the magic & mask too to match literal zeros. Note: the reason we can use 24 in the magic is that we can work off of the fact that for bytes the mask would clobber, we can stuff any value into magic that we want. So when mask is \x00, we don't need the magic to also be \x00, it can be an unescaped raw byte like '!'. This lets us handle more formats (barely) under the current 256 limit, but that's a pretty tall hoop to force people to jump through. With all that said, let's bump the limit from 256 bytes to 1920. This way we support escaping every byte of the mask & magic field (which is 1024 bytes by themselves -- 128 * 4 * 2), and we leave plenty of room for other fields. Like long paths to the interpreter (when you have source in your /really/long/homedir/qemu/foo). Since the current code stuffs more than one structure into the same buffer, we leave a bit of space to easily round up to 2k. 1920 is just as arbitrary as 256 ;). Signed-off-by: Mike Frysinger Cc: Al Viro Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/binfmt_misc.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/binfmt_misc.txt b/Documentation/binfmt_misc.txt index c1ed6948ba80..f64372b284e8 100644 --- a/Documentation/binfmt_misc.txt +++ b/Documentation/binfmt_misc.txt @@ -58,7 +58,7 @@ Here is what the fields mean: There are some restrictions: - - the whole register string may not exceed 255 characters + - the whole register string may not exceed 1920 characters - the magic must reside in the first 128 bytes of the file, i.e. offset+size(magic) has to be less than 128 - the interpreter string may not exceed 127 characters -- cgit v1.2.3 From 43bd40e5b6eab989a2186b09d45b8ff8efd127b2 Mon Sep 17 00:00:00 2001 From: Mike Frysinger Date: Mon, 13 Oct 2014 15:52:05 -0700 Subject: binfmt_misc: touch up documentation a bit Line wrap the content to 80 cols, and add more details to various fields to match the code. Drop reference to a website that does not exist anymore. Signed-off-by: Mike Frysinger Cc: Al Viro Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/binfmt_misc.txt | 48 +++++++++++++++++++++++++------------------ 1 file changed, 28 insertions(+), 20 deletions(-) (limited to 'Documentation') diff --git a/Documentation/binfmt_misc.txt b/Documentation/binfmt_misc.txt index f64372b284e8..6b1de7058371 100644 --- a/Documentation/binfmt_misc.txt +++ b/Documentation/binfmt_misc.txt @@ -15,39 +15,50 @@ First you must mount binfmt_misc: mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc To actually register a new binary type, you have to set up a string looking like -:name:type:offset:magic:mask:interpreter:flags (where you can choose the ':' upon -your needs) and echo it to /proc/sys/fs/binfmt_misc/register. +:name:type:offset:magic:mask:interpreter:flags (where you can choose the ':' +upon your needs) and echo it to /proc/sys/fs/binfmt_misc/register. + Here is what the fields mean: - 'name' is an identifier string. A new /proc file will be created with this - name below /proc/sys/fs/binfmt_misc + name below /proc/sys/fs/binfmt_misc; cannot contain slashes '/' for obvious + reasons. - 'type' is the type of recognition. Give 'M' for magic and 'E' for extension. - 'offset' is the offset of the magic/mask in the file, counted in bytes. This - defaults to 0 if you omit it (i.e. you write ':name:type::magic...') + defaults to 0 if you omit it (i.e. you write ':name:type::magic...'). Ignored + when using filename extension matching. - 'magic' is the byte sequence binfmt_misc is matching for. The magic string - may contain hex-encoded characters like \x0a or \xA4. In a shell environment - you will have to write \\x0a to prevent the shell from eating your \. + may contain hex-encoded characters like \x0a or \xA4. Note that you must + escape any NUL bytes; parsing halts at the first one. In a shell environment + you might have to write \\x0a to prevent the shell from eating your \. If you chose filename extension matching, this is the extension to be recognised (without the '.', the \x0a specials are not allowed). Extension - matching is case sensitive! + matching is case sensitive, and slashes '/' are not allowed! - 'mask' is an (optional, defaults to all 0xff) mask. You can mask out some bits from matching by supplying a string like magic and as long as magic. - The mask is anded with the byte sequence of the file. + The mask is anded with the byte sequence of the file. Note that you must + escape any NUL bytes; parsing halts at the first one. Ignored when using + filename extension matching. - 'interpreter' is the program that should be invoked with the binary as first argument (specify the full path) - 'flags' is an optional field that controls several aspects of the invocation - of the interpreter. It is a string of capital letters, each controls a certain - aspect. The following flags are supported - - 'P' - preserve-argv[0]. Legacy behavior of binfmt_misc is to overwrite the - original argv[0] with the full path to the binary. When this flag is - included, binfmt_misc will add an argument to the argument vector for - this purpose, thus preserving the original argv[0]. + of the interpreter. It is a string of capital letters, each controls a + certain aspect. The following flags are supported - + 'P' - preserve-argv[0]. Legacy behavior of binfmt_misc is to overwrite + the original argv[0] with the full path to the binary. When this + flag is included, binfmt_misc will add an argument to the argument + vector for this purpose, thus preserving the original argv[0]. + e.g. If your interp is set to /bin/foo and you run `blah` (which is + in /usr/local/bin), then the kernel will execute /bin/foo with + argv[] set to ["/bin/foo", "/usr/local/bin/blah", "blah"]. The + interp has to be aware of this so it can execute /usr/local/bin/blah + with argv[] set to ["blah"]. 'O' - open-binary. Legacy behavior of binfmt_misc is to pass the full path of the binary to the interpreter as an argument. When this flag is included, binfmt_misc will open the file for reading and pass its descriptor as an argument, instead of the full path, thus allowing - the interpreter to execute non-readable binaries. This feature should - be used with care - the interpreter has to be trusted not to emit - the contents of the non-readable binary. + the interpreter to execute non-readable binaries. This feature + should be used with care - the interpreter has to be trusted not to + emit the contents of the non-readable binary. 'C' - credentials. Currently, the behavior of binfmt_misc is to calculate the credentials and security token of the new process according to the interpreter. When this flag is included, these attributes are @@ -110,7 +121,4 @@ passes it the full filename (or the file descriptor) to use. Using $PATH can cause unexpected behaviour and can be a security hazard. -There is a web page about binfmt_misc at -http://www.tat.physik.uni-tuebingen.de - Richard Günther -- cgit v1.2.3 From 87d672cbd512c8dca01423381c94ac3658db0a18 Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Mon, 13 Oct 2014 15:52:24 -0700 Subject: autofs: the documentation I wanted to read This documents autofs from the perspective of what the module actually supports rather than how automount is expected to use it. It is formatted using "markdown" and works best with Markdown.pl (markdown_py doesn't like some constructs). [rdunlap@infradead.org: copy editing] Signed-off-by: NeilBrown Cc: Randy Dunlap Acked-by: Ian Kent Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/filesystems/autofs4.txt | 520 ++++++++++++++++++++++++++++++++++ 1 file changed, 520 insertions(+) create mode 100644 Documentation/filesystems/autofs4.txt (limited to 'Documentation') diff --git a/Documentation/filesystems/autofs4.txt b/Documentation/filesystems/autofs4.txt new file mode 100644 index 000000000000..39d02e19fb62 --- /dev/null +++ b/Documentation/filesystems/autofs4.txt @@ -0,0 +1,520 @@ + + + + +autofs - how it works +===================== + +Purpose +------- + +The goal of autofs is to provide on-demand mounting and race free +automatic unmounting of various other filesystems. This provides two +key advantages: + +1. There is no need to delay boot until all filesystems that + might be needed are mounted. Processes that try to access those + slow filesystems might be delayed but other processes can + continue freely. This is particularly important for + network filesystems (e.g. NFS) or filesystems stored on + media with a media-changing robot. + +2. The names and locations of filesystems can be stored in + a remote database and can change at any time. The content + in that data base at the time of access will be used to provide + a target for the access. The interpretation of names in the + filesystem can even be programmatic rather than database-backed, + allowing wildcards for example, and can vary based on the user who + first accessed a name. + +Context +------- + +The "autofs4" filesystem module is only one part of an autofs system. +There also needs to be a user-space program which looks up names +and mounts filesystems. This will often be the "automount" program, +though other tools including "systemd" can make use of "autofs4". +This document describes only the kernel module and the interactions +required with any user-space program. Subsequent text refers to this +as the "automount daemon" or simply "the daemon". + +"autofs4" is a Linux kernel module with provides the "autofs" +filesystem type. Several "autofs" filesystems can be mounted and they +can each be managed separately, or all managed by the same daemon. + +Content +------- + +An autofs filesystem can contain 3 sorts of objects: directories, +symbolic links and mount traps. Mount traps are directories with +extra properties as described in the next section. + +Objects can only be created by the automount daemon: symlinks are +created with a regular `symlink` system call, while directories and +mount traps are created with `mkdir`. The determination of whether a +directory should be a mount trap or not is quite _ad hoc_, largely for +historical reasons, and is determined in part by the +*direct*/*indirect*/*offset* mount options, and the *maxproto* mount option. + +If neither the *direct* or *offset* mount options are given (so the +mount is considered to be *indirect*), then the root directory is +always a regular directory, otherwise it is a mount trap when it is +empty and a regular directory when not empty. Note that *direct* and +*offset* are treated identically so a concise summary is that the root +directory is a mount trap only if the filesystem is mounted *direct* +and the root is empty. + +Directories created in the root directory are mount traps only if the +filesystem is mounted *indirect* and they are empty. + +Directories further down the tree depend on the *maxproto* mount +option and particularly whether it is less than five or not. +When *maxproto* is five, no directories further down the +tree are ever mount traps, they are always regular directories. When +the *maxproto* is four (or three), these directories are mount traps +precisely when they are empty. + +So: non-empty (i.e. non-leaf) directories are never mount traps. Empty +directories are sometimes mount traps, and sometimes not depending on +where in the tree they are (root, top level, or lower), the *maxproto*, +and whether the mount was *indirect* or not. + +Mount Traps +--------------- + +A core element of the implementation of autofs is the Mount Traps +which are provided by the Linux VFS. Any directory provided by a +filesystem can be designated as a trap. This involves two separate +features that work together to allow autofs to do its job. + +**DCACHE_NEED_AUTOMOUNT** + +If a dentry has the DCACHE_NEED_AUTOMOUNT flag set (which gets set if +the inode has S_AUTOMOUNT set, or can be set directly) then it is +(potentially) a mount trap. Any access to this directory beyond a +"`stat`" will (normally) cause the `d_op->d_automount()` dentry operation +to be called. The task of this method is to find the filesystem that +should be mounted on the directory and to return it. The VFS is +responsible for actually mounting the root of this filesystem on the +directory. + +autofs doesn't find the filesystem itself but sends a message to the +automount daemon asking it to find and mount the filesystem. The +autofs `d_automount` method then waits for the daemon to report that +everything is ready. It will then return "`NULL`" indicating that the +mount has already happened. The VFS doesn't try to mount anything but +follows down the mount that is already there. + +This functionality is sufficient for some users of mount traps such +as NFS which creates traps so that mountpoints on the server can be +reflected on the client. However it is not sufficient for autofs. As +mounting onto a directory is considered to be "beyond a `stat`", the +automount daemon would not be able to mount a filesystem on the 'trap' +directory without some way to avoid getting caught in the trap. For +that purpose there is another flag. + +**DCACHE_MANAGE_TRANSIT** + +If a dentry has DCACHE_MANAGE_TRANSIT set then two very different but +related behaviors are invoked, both using the `d_op->d_manage()` +dentry operation. + +Firstly, before checking to see if any filesystem is mounted on the +directory, d_manage() will be called with the `rcu_walk` parameter set +to `false`. It may return one of three things: + +- A return value of zero indicates that there is nothing special + about this dentry and normal checks for mounts and automounts + should proceed. + + autofs normally returns zero, but first waits for any + expiry (automatic unmounting of the mounted filesystem) to + complete. This avoids races. + +- A return value of `-EISDIR` tells the VFS to ignore any mounts + on the directory and to not consider calling `->d_automount()`. + This effectively disables the **DCACHE_NEED_AUTOMOUNT** flag + causing the directory not be a mount trap after all. + + autofs returns this if it detects that the process performing the + lookup is the automount daemon and that the mount has been + requested but has not yet completed. How it determines this is + discussed later. This allows the automount daemon not to get + caught in the mount trap. + + There is a subtlety here. It is possible that a second autofs + filesystem can be mounted below the first and for both of them to + be managed by the same daemon. For the daemon to be able to mount + something on the second it must be able to "walk" down past the + first. This means that d_manage cannot *always* return -EISDIR for + the automount daemon. It must only return it when a mount has + been requested, but has not yet completed. + + `d_manage` also returns `-EISDIR` if the dentry shouldn't be a + mount trap, either because it is a symbolic link or because it is + not empty. + +- Any other negative value is treated as an error and returned + to the caller. + + autofs can return + + - -ENOENT if the automount daemon failed to mount anything, + - -ENOMEM if it ran out of memory, + - -EINTR if a signal arrived while waiting for expiry to + complete + - or any other error sent down by the automount daemon. + + +The second use case only occurs during an "RCU-walk" and so `rcu_walk` +will be set. + +An RCU-walk is a fast and lightweight process for walking down a +filename path (i.e. it is like running on tip-toes). RCU-walk cannot +cope with all situations so when it finds a difficulty it falls back +to "REF-walk", which is slower but more robust. + +RCU-walk will never call `->d_automount`; the filesystems must already +be mounted or RCU-walk cannot handle the path. +To determine if a mount-trap is safe for RCU-walk mode it calls +`->d_manage()` with `rcu_walk` set to `true`. + +In this case `d_manage()` must avoid blocking and should avoid taking +spinlocks if at all possible. Its sole purpose is to determine if it +would be safe to follow down into any mounted directory and the only +reason that it might not be is if an expiry of the mount is +underway. + +In the `rcu_walk` case, `d_manage()` cannot return -EISDIR to tell the +VFS that this is a directory that doesn't require d_automount. If +`rcu_walk` sees a dentry with DCACHE_NEED_AUTOMOUNT set but nothing +mounted, it *will* fall back to REF-walk. `d_manage()` cannot make the +VFS remain in RCU-walk mode, but can only tell it to get out of +RCU-walk mode by returning `-ECHILD`. + +So `d_manage()`, when called with `rcu_walk` set, should either return +-ECHILD if there is any reason to believe it is unsafe to end the +mounted filesystem, and otherwise should return 0. + +autofs will return `-ECHILD` if an expiry of the filesystem has been +initiated or is being considered, otherwise it returns 0. + + +Mountpoint expiry +----------------- + +The VFS has a mechansim for automatically expiring unused mounts, +much as it can expire any unused dentry information from the dcache. +This is guided by the MNT_SHRINKABLE flag. This only applies to +mounts that were created by `d_automount()` returning a filesystem to be +mounted. As autofs doesn't return such a filesystem but leaves the +mounting to the automount daemon, it must involve the automount daemon +in unmounting as well. This also means that autofs has more control +of expiry. + +The VFS also supports "expiry" of mounts using the MNT_EXPIRE flag to +the `umount` system call. Unmounting with MNT_EXPIRE will fail unless +a previous attempt had been made, and the filesystem has been inactive +and untouched since that previous attempt. autofs4 does not depend on +this but has its own internal tracking of whether filesystems were +recently used. This allows individual names in the autofs directory +to expire separately. + +With version 4 of the protocol, the automount daemon can try to +unmount any filesystems mounted on the autofs filesystem or remove any +symbolic links or empty directories any time it likes. If the unmount +or removal is successful the filesystem will be returned to the state +it was before the mount or creation, so that any access of the name +will trigger normal auto-mount processing. In particlar, `rmdir` and +`unlink` do not leave negative entries in the dcache as a normal +filesystem would, so an attempt to access a recently-removed object is +passed to autofs for handling. + +With version 5, this is not safe except for unmounting from top-level +directories. As lower-level directories are never mount traps, other +processes will see an empty directory as soon as the filesystem is +unmounted. So it is generally safest to use the autofs expiry +protocol described below. + +Normally the daemon only wants to remove entries which haven't been +used for a while. For this purpose autofs maintains a "`last_used`" +time stamp on each directory or symlink. For symlinks it genuinely +does record the last time the symlink was "used" or followed to find +out where it points to. For directories the field is a slight +misnomer. It actually records the last time that autofs checked if +the directory or one of its descendents was busy and found that it +was. This is just as useful and doesn't require updating the field so +often. + +The daemon is able to ask autofs if anything is due to be expired, +using an `ioctl` as discussed later. For a *direct* mount, autofs +considers if the entire mount-tree can be unmounted or not. For an +*indirect* mount, autofs considers each of the names in the top level +directory to determine if any of those can be unmounted and cleaned +up. + +There is an option with indirect mounts to consider each of the leaves +that has been mounted on instead of considering the top-level names. +This is intended for compatability with version 4 of autofs and should +be considered as deprecated. + +When autofs considers a directory it checks the `last_used` time and +compares it with the "timeout" value set when the filesystem was +mounted, though this check is ignored in some cases. It also checks if +the directory or anything below it is in use. For symbolic links, +only the `last_used` time is ever considered. + +If both appear to support expiring the directory or symlink, an action +is taken. + +There are two ways to ask autofs to consider expiry. The first is to +use the **AUTOFS_IOC_EXPIRE** ioctl. This only works for indirect +mounts. If it finds something in the root directory to expire it will +return the name of that thing. Once a name has been returned the +automount daemon needs to unmount any filesystems mounted below the +name normally. As described above, this is unsafe for non-toplevel +mounts in a version-5 autofs. For this reason the current `automountd` +does not use this ioctl. + +The second mechanism uses either the **AUTOFS_DEV_IOCTL_EXPIRE_CMD** or +the **AUTOFS_IOC_EXPIRE_MULTI** ioctl. This will work for both direct and +indirect mounts. If it selects an object to expire, it will notify +the daemon using the notification mechanism described below. This +will block until the daemon acknowledges the expiry notification. +This implies that the "`EXPIRE`" ioctl must be sent from a different +thread than the one which handles notification. + +While the ioctl is blocking, the entry is marked as "expiring" and +`d_manage` will block until the daemon affirms that the unmount has +completed (together with removing any directories that might have been +necessary), or has been aborted. + +Communicating with autofs: detecting the daemon +----------------------------------------------- + +There are several forms of communication between the automount daemon +and the filesystem. As we have already seen, the daemon can create and +remove directories and symlinks using normal filesystem operations. +autofs knows whether a process requesting some operation is the daemon +or not based on its process-group id number (see getpgid(1)). + +When an autofs filesystem it mounted the pgid of the mounting +processes is recorded unless the "pgrp=" option is given, in which +case that number is recorded instead. Any request arriving from a +process in that process group is considered to come from the daemon. +If the daemon ever has to be stopped and restarted a new pgid can be +provided through an ioctl as will be described below. + +Communicating with autofs: the event pipe +----------------------------------------- + +When an autofs filesystem is mounted, the 'write' end of a pipe must +be passed using the 'fd=' mount option. autofs will write +notification messages to this pipe for the daemon to respond to. +For version 5, the format of the message is: + + struct autofs_v5_packet { + int proto_version; /* Protocol version */ + int type; /* Type of packet */ + autofs_wqt_t wait_queue_token; + __u32 dev; + __u64 ino; + __u32 uid; + __u32 gid; + __u32 pid; + __u32 tgid; + __u32 len; + char name[NAME_MAX+1]; + }; + +where the type is one of + + autofs_ptype_missing_indirect + autofs_ptype_expire_indirect + autofs_ptype_missing_direct + autofs_ptype_expire_direct + +so messages can indicate that a name is missing (something tried to +access it but it isn't there) or that it has been selected for expiry. + +The pipe will be set to "packet mode" (equivalent to passing +`O_DIRECT`) to _pipe2(2)_ so that a read from the pipe will return at +most one packet, and any unread portion of a packet will be discarded. + +The `wait_queue_token` is a unique number which can identify a +particular request to be acknowledged. When a message is sent over +the pipe the affected dentry is marked as either "active" or +"expiring" and other accesses to it block until the message is +acknowledged using one of the ioctls below and the relevant +`wait_queue_token`. + +Communicating with autofs: root directory ioctls +------------------------------------------------ + +The root directory of an autofs filesystem will respond to a number of +ioctls. The process issuing the ioctl must have the CAP_SYS_ADMIN +capability, or must be the automount daemon. + +The available ioctl commands are: + +- **AUTOFS_IOC_READY**: a notification has been handled. The argument + to the ioctl command is the "wait_queue_token" number + corresponding to the notification being acknowledged. +- **AUTOFS_IOC_FAIL**: similar to above, but indicates failure with + the error code `ENOENT`. +- **AUTOFS_IOC_CATATONIC**: Causes the autofs to enter "catatonic" + mode meaning that it stops sending notifications to the daemon. + This mode is also entered if a write to the pipe fails. +- **AUTOFS_IOC_PROTOVER**: This returns the protocol version in use. +- **AUTOFS_IOC_PROTOSUBVER**: Returns the protocol sub-version which + is really a version number for the implementation. It is + currently 2. +- **AUTOFS_IOC_SETTIMEOUT**: This passes a pointer to an unsigned + long. The value is used to set the timeout for expiry, and + the current timeout value is stored back through the pointer. +- **AUTOFS_IOC_ASKUMOUNT**: Returns, in the pointed-to `int`, 1 if + the filesystem could be unmounted. This is only a hint as + the situation could change at any instant. This call can be + use to avoid a more expensive full unmount attempt. +- **AUTOFS_IOC_EXPIRE**: as described above, this asks if there is + anything suitable to expire. A pointer to a packet: + + struct autofs_packet_expire_multi { + int proto_version; /* Protocol version */ + int type; /* Type of packet */ + autofs_wqt_t wait_queue_token; + int len; + char name[NAME_MAX+1]; + }; + + is required. This is filled in with the name of something + that can be unmounted or removed. If nothing can be expired, + `errno` is set to `EAGAIN`. Even though a `wait_queue_token` + is present in the structure, no "wait queue" is established + and no acknowledgment is needed. +- **AUTOFS_IOC_EXPIRE_MULTI**: This is similar to + **AUTOFS_IOC_EXPIRE** except that it causes notification to be + sent to the daemon, and it blocks until the daemon acknowledges. + The argument is an integer which can contain two different flags. + + **AUTOFS_EXP_IMMEDIATE** causes `last_used` time to be ignored + and objects are expired if the are not in use. + + **AUTOFS_EXP_LEAVES** will select a leaf rather than a top-level + name to expire. This is only safe when *maxproto* is 4. + +Communicating with autofs: char-device ioctls +--------------------------------------------- + +It is not always possible to open the root of an autofs filesystem, +particularly a *direct* mounted filesystem. If the automount daemon +is restarted there is no way for it to regain control of existing +mounts using any of the above communication channels. To address this +need there is a "miscellaneous" character device (major 10, minor 235) +which can be used to communicate directly with the autofs filesystem. +It requires CAP_SYS_ADMIN for access. + +The `ioctl`s that can be used on this device are described in a separate +document `autofs4-mount-control.txt`, and are summarized briefly here. +Each ioctl is passed a pointer to an `autofs_dev_ioctl` structure: + + struct autofs_dev_ioctl { + __u32 ver_major; + __u32 ver_minor; + __u32 size; /* total size of data passed in + * including this struct */ + __s32 ioctlfd; /* automount command fd */ + + __u32 arg1; /* Command parameters */ + __u32 arg2; + + char path[0]; + }; + +For the **OPEN_MOUNT** and **IS_MOUNTPOINT** commands, the target +filesystem is identified by the `path`. All other commands identify +the filesystem by the `ioctlfd` which is a file descriptor open on the +root, and which can be returned by **OPEN_MOUNT**. + +The `ver_major` and `ver_minor` are in/out parameters which check that +the requested version is supported, and report the maximum version +that the kernel module can support. + +Commands are: + +- **AUTOFS_DEV_IOCTL_VERSION_CMD**: does nothing, except validate and + set version numbers. +- **AUTOFS_DEV_IOCTL_OPENMOUNT_CMD**: return an open file descriptor + on the root of an autofs filesystem. The filesystem is identified + by name and device number, which is stored in `arg1`. Device + numbers for existing filesystems can be found in + `/proc/self/mountinfo`. +- **AUTOFS_DEV_IOCTL_CLOSEMOUNT_CMD**: same as `close(ioctlfd)`. +- **AUTOFS_DEV_IOCTL_SETPIPEFD_CMD**: if the filesystem is in + catatonic mode, this can provide the write end of a new pipe + in `arg1` to re-establish communication with a daemon. The + process group of the calling process is used to identify the + daemon. +- **AUTOFS_DEV_IOCTL_REQUESTER_CMD**: `path` should be a + name within the filesystem that has been auto-mounted on. + arg1 is the dev number of the underlying autofs. On successful + return, `arg1` and `arg2` will be the UID and GID of the process + which triggered that mount. + +- **AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD**: Check if path is a + mountpoint of a particular type - see separate documentation for + details. + +- **AUTOFS_DEV_IOCTL_PROTOVER_CMD**: +- **AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD**: +- **AUTOFS_DEV_IOCTL_READY_CMD**: +- **AUTOFS_DEV_IOCTL_FAIL_CMD**: +- **AUTOFS_DEV_IOCTL_CATATONIC_CMD**: +- **AUTOFS_DEV_IOCTL_TIMEOUT_CMD**: +- **AUTOFS_DEV_IOCTL_EXPIRE_CMD**: +- **AUTOFS_DEV_IOCTL_ASKUMOUNT_CMD**: These all have the same + function as the similarly named **AUTOFS_IOC** ioctls, except + that **FAIL** can be given an explicit error number in `arg1` + instead of assuming `ENOENT`, and this **EXPIRE** command + corresponds to **AUTOFS_IOC_EXPIRE_MULTI**. + +Catatonic mode +-------------- + +As mentioned, an autofs mount can enter "catatonic" mode. This +happens if a write to the notification pipe fails, or if it is +explicitly requested by an `ioctl`. + +When entering catatonic mode, the pipe is closed and any pending +notifications are acknowledged with the error `ENOENT`. + +Once in catatonic mode attempts to access non-existing names will +result in `ENOENT` while attempts to access existing directories will +be treated in the same way as if they came from the daemon, so mount +traps will not fire. + +When the filesystem is mounted a _uid_ and _gid_ can be given which +set the ownership of directories and symbolic links. When the +filesystem is in catatonic mode, any process with a matching UID can +create directories or symlinks in the root directory, but not in other +directories. + +Catatonic mode can only be left via the +**AUTOFS_DEV_IOCTL_OPENMOUNT_CMD** ioctl on the `/dev/autofs`. + +autofs, name spaces, and shared mounts +-------------------------------------- + +With bind mounts and name spaces it is possible for an autofs +filesystem to appear at multiple places in one or more filesystem +name spaces. For this to work sensibly, the autofs filesystem should +always be mounted "shared". e.g. + +> `mount --make-shared /autofs/mount/point` + +The automount daemon is only able to mange a single mount location for +an autofs filesystem and if mounts on that are not 'shared', other +locations will not behave as expected. In particular access to those +other locations will likely result in the `ELOOP` error + +> Too many levels of symbolic links -- cgit v1.2.3 From d67288da51b782f54dd3ae1455b997131160fd41 Mon Sep 17 00:00:00 2001 From: Chanwoo Choi Date: Mon, 13 Oct 2014 15:52:31 -0700 Subject: rtc: s3c: remove warning message when checking coding style with checkpatch script Remove warning message when checking codeing style with checkpatch script and reduce un-necessary i2c read operation on s3c_rtc_enable. WARNING: line over 80 characters #406: FILE: drivers/rtc/rtc-s3c.c:406: + if ((readw(info->base + S3C2410_RTCCON) & S3C2410_RTCCON_RTCEN) == 0) { WARNING: line over 80 characters #414: FILE: drivers/rtc/rtc-s3c.c:414: + if ((readw(info->base + S3C2410_RTCCON) & S3C2410_RTCCON_CNTSEL)) { WARNING: line over 80 characters #422: FILE: drivers/rtc/rtc-s3c.c:422: + if ((readw(info->base + S3C2410_RTCCON) & S3C2410_RTCCON_CLKRST)) { WARNING: Missing a blank line after declarations #451: FILE: drivers/rtc/rtc-s3c.c:451: + struct s3c_rtc_drv_data *data; + if (pdev->dev.of_node) { WARNING: Missing a blank line after declarations #453: FILE: drivers/rtc/rtc-s3c.c:453: + const struct of_device_id *match; + match = of_match_node(s3c_rtc_dt_match, pdev->dev.of_node); WARNING: DT compatible string "samsung,s3c2416-rtc" appears un-documented -- check ./Documentation/devicetree/bindings/ #650: FILE: drivers/rtc/rtc-s3c.c:650: + .compatible = "samsung,s3c2416-rtc", WARNING: DT compatible string "samsung,s3c2443-rtc" appears un-documented -- check ./Documentation/devicetree/bindings/ #653: FILE: drivers/rtc/rtc-s3c.c:653: + .compatible = "samsung,s3c2443-rtc", Signed-off-by: Chanwoo Choi Acked-by: Kyungmin Park Cc: Alessandro Zummo Cc: Kukjin Kim Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/devicetree/bindings/rtc/s3c-rtc.txt | 2 ++ drivers/rtc/rtc-s3c.c | 26 ++++++++++++----------- 2 files changed, 16 insertions(+), 12 deletions(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/rtc/s3c-rtc.txt b/Documentation/devicetree/bindings/rtc/s3c-rtc.txt index 7ac7259fe9ea..06db446ba71c 100644 --- a/Documentation/devicetree/bindings/rtc/s3c-rtc.txt +++ b/Documentation/devicetree/bindings/rtc/s3c-rtc.txt @@ -3,6 +3,8 @@ Required properties: - compatible: should be one of the following. * "samsung,s3c2410-rtc" - for controllers compatible with s3c2410 rtc. + * "samsung,s3c2416-rtc" - for controllers compatible with s3c2416 rtc. + * "samsung,s3c2443-rtc" - for controllers compatible with s3c2443 rtc. * "samsung,s3c6410-rtc" - for controllers compatible with s3c6410 rtc. - reg: physical base address of the controller and length of memory mapped region. diff --git a/drivers/rtc/rtc-s3c.c b/drivers/rtc/rtc-s3c.c index 2f71b13c00d6..4e95cca7615c 100644 --- a/drivers/rtc/rtc-s3c.c +++ b/drivers/rtc/rtc-s3c.c @@ -382,25 +382,25 @@ static const struct rtc_class_ops s3c_rtcops = { static void s3c_rtc_enable(struct s3c_rtc *info, int en) { - unsigned int tmp; + unsigned int con, tmp; clk_enable(info->rtc_clk); + + con = readw(info->base + S3C2410_RTCCON); if (!en) { - tmp = readw(info->base + S3C2410_RTCCON); if (info->cpu_type == TYPE_S3C64XX) - tmp &= ~S3C64XX_RTCCON_TICEN; - tmp &= ~S3C2410_RTCCON_RTCEN; - writew(tmp, info->base + S3C2410_RTCCON); + con &= ~S3C64XX_RTCCON_TICEN; + con &= ~S3C2410_RTCCON_RTCEN; + writew(con, info->base + S3C2410_RTCCON); if (info->cpu_type != TYPE_S3C64XX) { - tmp = readb(info->base + S3C2410_TICNT); - tmp &= ~S3C2410_TICNT_ENABLE; - writeb(tmp, info->base + S3C2410_TICNT); + con = readb(info->base + S3C2410_TICNT); + con &= ~S3C2410_TICNT_ENABLE; + writeb(con, info->base + S3C2410_TICNT); } } else { /* re-enable the device, and check it is ok */ - - if ((readw(info->base + S3C2410_RTCCON) & S3C2410_RTCCON_RTCEN) == 0) { + if ((con & S3C2410_RTCCON_RTCEN) == 0) { dev_info(info->dev, "rtc disabled, re-enabling\n"); tmp = readw(info->base + S3C2410_RTCCON); @@ -408,7 +408,7 @@ static void s3c_rtc_enable(struct s3c_rtc *info, int en) info->base + S3C2410_RTCCON); } - if ((readw(info->base + S3C2410_RTCCON) & S3C2410_RTCCON_CNTSEL)) { + if (con & S3C2410_RTCCON_CNTSEL) { dev_info(info->dev, "removing RTCCON_CNTSEL\n"); tmp = readw(info->base + S3C2410_RTCCON); @@ -416,7 +416,7 @@ static void s3c_rtc_enable(struct s3c_rtc *info, int en) info->base + S3C2410_RTCCON); } - if ((readw(info->base + S3C2410_RTCCON) & S3C2410_RTCCON_CLKRST)) { + if (con & S3C2410_RTCCON_CLKRST) { dev_info(info->dev, "removing RTCCON_CLKRST\n"); tmp = readw(info->base + S3C2410_RTCCON); @@ -445,8 +445,10 @@ static inline int s3c_rtc_get_driver_data(struct platform_device *pdev) { #ifdef CONFIG_OF struct s3c_rtc_drv_data *data; + if (pdev->dev.of_node) { const struct of_device_id *match; + match = of_match_node(s3c_rtc_dt_match, pdev->dev.of_node); data = (struct s3c_rtc_drv_data *) match->data; return data->cpu_type; -- cgit v1.2.3 From df9e26d093d33a097c5558aab017dd2f540ccfe5 Mon Sep 17 00:00:00 2001 From: Chanwoo Choi Date: Mon, 13 Oct 2014 15:52:35 -0700 Subject: rtc: s3c: add support for RTC of Exynos3250 SoC Add support for RTC of Exynos3250 SoC. The Exynos3250 needs source clock(32.768KHz) for RTC block. If source clock of RTC is registerd on clock list of common clk framework, Exynos RTC drvier have to control this clock. Clock list for s3c-rtc device: - rtc : CLK_RTC of CLK_GATE_IP_PERIR is gate clock for RTC. - rtc_src : XrtcXTI is 32.768.kHz source clock for RTC. (XRTCXTI: Specifies a clock from 32.768 kHz crystal pad with XRTCXTI and XRTCXTO pins. RTC uses this clock as the source of a real-time clock.) Signed-off-by: Chanwoo Choi Acked-by: Kyungmin Park Cc: Alessandro Zummo Cc: Kukjin Kim Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/devicetree/bindings/rtc/s3c-rtc.txt | 1 + drivers/rtc/rtc-s3c.c | 93 ++++++++++++++++++++++- 2 files changed, 93 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/rtc/s3c-rtc.txt b/Documentation/devicetree/bindings/rtc/s3c-rtc.txt index 06db446ba71c..ab757b84daa7 100644 --- a/Documentation/devicetree/bindings/rtc/s3c-rtc.txt +++ b/Documentation/devicetree/bindings/rtc/s3c-rtc.txt @@ -6,6 +6,7 @@ Required properties: * "samsung,s3c2416-rtc" - for controllers compatible with s3c2416 rtc. * "samsung,s3c2443-rtc" - for controllers compatible with s3c2443 rtc. * "samsung,s3c6410-rtc" - for controllers compatible with s3c6410 rtc. + * "samsung,exynos3250-rtc" - for controllers compatible with exynos3250 rtc. - reg: physical base address of the controller and length of memory mapped region. - interrupts: Two interrupt numbers to the cpu should be specified. First diff --git a/drivers/rtc/rtc-s3c.c b/drivers/rtc/rtc-s3c.c index 0d9089228bb0..a6b1252c9941 100644 --- a/drivers/rtc/rtc-s3c.c +++ b/drivers/rtc/rtc-s3c.c @@ -38,6 +38,7 @@ struct s3c_rtc { void __iomem *base; struct clk *rtc_clk; + struct clk *rtc_src_clk; bool enabled; struct s3c_rtc_data *data; @@ -54,6 +55,7 @@ struct s3c_rtc { struct s3c_rtc_data { int max_user_freq; + bool needs_src_clk; void (*irq_handler) (struct s3c_rtc *info, int mask); void (*set_freq) (struct s3c_rtc *info, int freq); @@ -73,10 +75,14 @@ static void s3c_rtc_alarm_clk_enable(struct s3c_rtc *info, bool enable) if (enable) { if (!info->enabled) { clk_enable(info->rtc_clk); + if (info->data->needs_src_clk) + clk_enable(info->rtc_src_clk); info->enabled = true; } } else { if (info->enabled) { + if (info->data->needs_src_clk) + clk_disable(info->rtc_src_clk); clk_disable(info->rtc_clk); info->enabled = false; } @@ -114,12 +120,16 @@ static int s3c_rtc_setaie(struct device *dev, unsigned int enabled) dev_dbg(info->dev, "%s: aie=%d\n", __func__, enabled); clk_enable(info->rtc_clk); + if (info->data->needs_src_clk) + clk_enable(info->rtc_src_clk); tmp = readb(info->base + S3C2410_RTCALM) & ~S3C2410_RTCALM_ALMEN; if (enabled) tmp |= S3C2410_RTCALM_ALMEN; writeb(tmp, info->base + S3C2410_RTCALM); + if (info->data->needs_src_clk) + clk_disable(info->rtc_src_clk); clk_disable(info->rtc_clk); s3c_rtc_alarm_clk_enable(info, enabled); @@ -134,12 +144,16 @@ static int s3c_rtc_setfreq(struct s3c_rtc *info, int freq) return -EINVAL; clk_enable(info->rtc_clk); + if (info->data->needs_src_clk) + clk_enable(info->rtc_src_clk); spin_lock_irq(&info->pie_lock); if (info->data->set_freq) info->data->set_freq(info, freq); spin_unlock_irq(&info->pie_lock); + if (info->data->needs_src_clk) + clk_disable(info->rtc_src_clk); clk_disable(info->rtc_clk); return 0; @@ -152,6 +166,9 @@ static int s3c_rtc_gettime(struct device *dev, struct rtc_time *rtc_tm) unsigned int have_retried = 0; clk_enable(info->rtc_clk); + if (info->data->needs_src_clk) + clk_enable(info->rtc_src_clk); + retry_get_time: rtc_tm->tm_min = readb(info->base + S3C2410_RTCMIN); rtc_tm->tm_hour = readb(info->base + S3C2410_RTCHOUR); @@ -185,6 +202,8 @@ static int s3c_rtc_gettime(struct device *dev, struct rtc_time *rtc_tm) rtc_tm->tm_mon -= 1; + if (info->data->needs_src_clk) + clk_disable(info->rtc_src_clk); clk_disable(info->rtc_clk); return rtc_valid_tm(rtc_tm); @@ -207,6 +226,8 @@ static int s3c_rtc_settime(struct device *dev, struct rtc_time *tm) } clk_enable(info->rtc_clk); + if (info->data->needs_src_clk) + clk_enable(info->rtc_src_clk); writeb(bin2bcd(tm->tm_sec), info->base + S3C2410_RTCSEC); writeb(bin2bcd(tm->tm_min), info->base + S3C2410_RTCMIN); @@ -215,6 +236,8 @@ static int s3c_rtc_settime(struct device *dev, struct rtc_time *tm) writeb(bin2bcd(tm->tm_mon + 1), info->base + S3C2410_RTCMON); writeb(bin2bcd(year), info->base + S3C2410_RTCYEAR); + if (info->data->needs_src_clk) + clk_disable(info->rtc_src_clk); clk_disable(info->rtc_clk); return 0; @@ -227,6 +250,9 @@ static int s3c_rtc_getalarm(struct device *dev, struct rtc_wkalrm *alrm) unsigned int alm_en; clk_enable(info->rtc_clk); + if (info->data->needs_src_clk) + clk_enable(info->rtc_src_clk); + alm_tm->tm_sec = readb(info->base + S3C2410_ALMSEC); alm_tm->tm_min = readb(info->base + S3C2410_ALMMIN); alm_tm->tm_hour = readb(info->base + S3C2410_ALMHOUR); @@ -278,7 +304,10 @@ static int s3c_rtc_getalarm(struct device *dev, struct rtc_wkalrm *alrm) else alm_tm->tm_year = -1; + if (info->data->needs_src_clk) + clk_disable(info->rtc_src_clk); clk_disable(info->rtc_clk); + return 0; } @@ -289,6 +318,9 @@ static int s3c_rtc_setalarm(struct device *dev, struct rtc_wkalrm *alrm) unsigned int alrm_en; clk_enable(info->rtc_clk); + if (info->data->needs_src_clk) + clk_enable(info->rtc_src_clk); + dev_dbg(dev, "s3c_rtc_setalarm: %d, %04d.%02d.%02d %02d:%02d:%02d\n", alrm->enabled, 1900 + tm->tm_year, tm->tm_mon + 1, tm->tm_mday, @@ -318,6 +350,8 @@ static int s3c_rtc_setalarm(struct device *dev, struct rtc_wkalrm *alrm) s3c_rtc_setaie(dev, alrm->enabled); + if (info->data->needs_src_clk) + clk_disable(info->rtc_src_clk); clk_disable(info->rtc_clk); return 0; @@ -328,10 +362,14 @@ static int s3c_rtc_proc(struct device *dev, struct seq_file *seq) struct s3c_rtc *info = dev_get_drvdata(dev); clk_enable(info->rtc_clk); + if (info->data->needs_src_clk) + clk_enable(info->rtc_src_clk); if (info->data->enable_tick) info->data->enable_tick(info, seq); + if (info->data->needs_src_clk) + clk_disable(info->rtc_src_clk); clk_disable(info->rtc_clk); return 0; @@ -351,6 +389,8 @@ static void s3c24xx_rtc_enable(struct s3c_rtc *info) unsigned int con, tmp; clk_enable(info->rtc_clk); + if (info->data->needs_src_clk) + clk_enable(info->rtc_src_clk); con = readw(info->base + S3C2410_RTCCON); /* re-enable the device, and check it is ok */ @@ -378,6 +418,8 @@ static void s3c24xx_rtc_enable(struct s3c_rtc *info) info->base + S3C2410_RTCCON); } + if (info->data->needs_src_clk) + clk_disable(info->rtc_src_clk); clk_disable(info->rtc_clk); } @@ -386,6 +428,8 @@ static void s3c24xx_rtc_disable(struct s3c_rtc *info) unsigned int con; clk_enable(info->rtc_clk); + if (info->data->needs_src_clk) + clk_enable(info->rtc_src_clk); con = readw(info->base + S3C2410_RTCCON); con &= ~S3C2410_RTCCON_RTCEN; @@ -395,6 +439,8 @@ static void s3c24xx_rtc_disable(struct s3c_rtc *info) con &= ~S3C2410_TICNT_ENABLE; writeb(con, info->base + S3C2410_TICNT); + if (info->data->needs_src_clk) + clk_disable(info->rtc_src_clk); clk_disable(info->rtc_clk); } @@ -403,12 +449,16 @@ static void s3c6410_rtc_disable(struct s3c_rtc *info) unsigned int con; clk_enable(info->rtc_clk); + if (info->data->needs_src_clk) + clk_enable(info->rtc_src_clk); con = readw(info->base + S3C2410_RTCCON); con &= ~S3C64XX_RTCCON_TICEN; con &= ~S3C2410_RTCCON_RTCEN; writew(con, info->base + S3C2410_RTCCON); + if (info->data->needs_src_clk) + clk_disable(info->rtc_src_clk); clk_disable(info->rtc_clk); } @@ -480,11 +530,19 @@ static int s3c_rtc_probe(struct platform_device *pdev) info->rtc_clk = devm_clk_get(&pdev->dev, "rtc"); if (IS_ERR(info->rtc_clk)) { - dev_err(&pdev->dev, "failed to find rtc clock source\n"); + dev_err(&pdev->dev, "failed to find rtc clock\n"); return PTR_ERR(info->rtc_clk); } clk_prepare_enable(info->rtc_clk); + info->rtc_src_clk = devm_clk_get(&pdev->dev, "rtc_src"); + if (IS_ERR(info->rtc_src_clk)) { + dev_err(&pdev->dev, "failed to find rtc source clock\n"); + return PTR_ERR(info->rtc_src_clk); + } + clk_prepare_enable(info->rtc_src_clk); + + /* check to see if everything is setup correctly */ if (info->data->enable) info->data->enable(info); @@ -538,6 +596,8 @@ static int s3c_rtc_probe(struct platform_device *pdev) s3c_rtc_setfreq(info, 1); + if (info->data->needs_src_clk) + clk_disable(info->rtc_src_clk); clk_disable(info->rtc_clk); return 0; @@ -557,6 +617,8 @@ static int s3c_rtc_suspend(struct device *dev) struct s3c_rtc *info = dev_get_drvdata(dev); clk_enable(info->rtc_clk); + if (info->data->needs_src_clk) + clk_enable(info->rtc_src_clk); /* save TICNT for anyone using periodic interrupts */ if (info->data->save_tick_cnt) @@ -572,6 +634,8 @@ static int s3c_rtc_suspend(struct device *dev) dev_err(dev, "enable_irq_wake failed\n"); } + if (info->data->needs_src_clk) + clk_disable(info->rtc_src_clk); clk_disable(info->rtc_clk); return 0; @@ -582,6 +646,8 @@ static int s3c_rtc_resume(struct device *dev) struct s3c_rtc *info = dev_get_drvdata(dev); clk_enable(info->rtc_clk); + if (info->data->needs_src_clk) + clk_enable(info->rtc_src_clk); if (info->data->enable) info->data->enable(info); @@ -594,6 +660,8 @@ static int s3c_rtc_resume(struct device *dev) info->wake_en = false; } + if (info->data->needs_src_clk) + clk_disable(info->rtc_src_clk); clk_disable(info->rtc_clk); return 0; @@ -604,7 +672,11 @@ static SIMPLE_DEV_PM_OPS(s3c_rtc_pm_ops, s3c_rtc_suspend, s3c_rtc_resume); static void s3c24xx_rtc_irq(struct s3c_rtc *info, int mask) { clk_enable(info->rtc_clk); + if (info->data->needs_src_clk) + clk_enable(info->rtc_src_clk); rtc_update_irq(info->rtc, 1, RTC_AF | RTC_IRQF); + if (info->data->needs_src_clk) + clk_disable(info->rtc_src_clk); clk_disable(info->rtc_clk); s3c_rtc_alarm_clk_enable(info, false); @@ -613,8 +685,12 @@ static void s3c24xx_rtc_irq(struct s3c_rtc *info, int mask) static void s3c6410_rtc_irq(struct s3c_rtc *info, int mask) { clk_enable(info->rtc_clk); + if (info->data->needs_src_clk) + clk_enable(info->rtc_src_clk); rtc_update_irq(info->rtc, 1, RTC_AF | RTC_IRQF); writeb(mask, info->base + S3C2410_INTP); + if (info->data->needs_src_clk) + clk_disable(info->rtc_src_clk); clk_disable(info->rtc_clk); s3c_rtc_alarm_clk_enable(info, false); @@ -780,6 +856,18 @@ static struct s3c_rtc_data const s3c6410_rtc_data = { .disable = s3c6410_rtc_disable, }; +static struct s3c_rtc_data const exynos3250_rtc_data = { + .max_user_freq = 32768, + .needs_src_clk = true, + .irq_handler = s3c6410_rtc_irq, + .set_freq = s3c6410_rtc_setfreq, + .enable_tick = s3c6410_rtc_enable_tick, + .save_tick_cnt = s3c6410_rtc_save_tick_cnt, + .restore_tick_cnt = s3c6410_rtc_restore_tick_cnt, + .enable = s3c24xx_rtc_enable, + .disable = s3c6410_rtc_disable, +}; + static const struct of_device_id s3c_rtc_dt_match[] = { { .compatible = "samsung,s3c2410-rtc", @@ -793,6 +881,9 @@ static const struct of_device_id s3c_rtc_dt_match[] = { }, { .compatible = "samsung,s3c6410-rtc", .data = (void *)&s3c6410_rtc_data, + }, { + .compatible = "samsung,exynos3250-rtc", + .data = (void *)&exynos3250_rtc_data, }, { /* sentinel */ }, }; -- cgit v1.2.3 From 2ac848c018615bf3605faa711207518292d4bfef Mon Sep 17 00:00:00 2001 From: Matti Vaittinen Date: Mon, 13 Oct 2014 15:52:46 -0700 Subject: Documentation: dt-bindings: trickle charger dt binding document for ds1339 Some DS13XX devices have "trickle chargers". Introduce a device tree binding for the resistor and diode configuration for enabling trickle charger. Signed-off-by: Matti Vaittinen Acked-by: Jason Cooper Cc: Rob Herring Cc: Pawel Moll Cc: Guenter Roeck Cc: Alessandro Zummo Cc: Mark Rutland Cc: Pavel Machek Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- .../devicetree/bindings/i2c/trivial-devices.txt | 1 - .../devicetree/bindings/rtc/dallas,ds1339.txt | 18 ++++++++++++++++++ 2 files changed, 18 insertions(+), 1 deletion(-) create mode 100644 Documentation/devicetree/bindings/rtc/dallas,ds1339.txt (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/i2c/trivial-devices.txt b/Documentation/devicetree/bindings/i2c/trivial-devices.txt index 5af3d9df6ecb..fbde415078e6 100644 --- a/Documentation/devicetree/bindings/i2c/trivial-devices.txt +++ b/Documentation/devicetree/bindings/i2c/trivial-devices.txt @@ -35,7 +35,6 @@ catalyst,24c32 i2c serial eeprom cirrus,cs42l51 Cirrus Logic CS42L51 audio codec dallas,ds1307 64 x 8, Serial, I2C Real-Time Clock dallas,ds1338 I2C RTC with 56-Byte NV RAM -dallas,ds1339 I2C Serial Real-Time Clock dallas,ds1340 I2C RTC with Trickle Charger dallas,ds1374 I2C, 32-Bit Binary Counter Watchdog RTC with Trickle Charger and Reset Input/Output dallas,ds1631 High-Precision Digital Thermometer diff --git a/Documentation/devicetree/bindings/rtc/dallas,ds1339.txt b/Documentation/devicetree/bindings/rtc/dallas,ds1339.txt new file mode 100644 index 000000000000..916f57601a8f --- /dev/null +++ b/Documentation/devicetree/bindings/rtc/dallas,ds1339.txt @@ -0,0 +1,18 @@ +* Dallas DS1339 I2C Serial Real-Time Clock + +Required properties: +- compatible: Should contain "dallas,ds1339". +- reg: I2C address for chip + +Optional properties: +- trickle-resistor-ohms : Selected resistor for trickle charger + Values usable for ds1339 are 250, 2000, 4000 + Should be given if trickle charger should be enabled +- trickle-diode-disable : Do not use internal trickle charger diode + Should be given if internal trickle charger diode should be disabled +Example: + ds1339: rtc@68 { + compatible = "dallas,ds1339"; + trickle-resistor-ohms = <250>; + reg = <0x68>; + }; -- cgit v1.2.3 From d5fae669a99d00dc9362da354f2b9fdfbeb669a7 Mon Sep 17 00:00:00 2001 From: Pavel Machek Date: Mon, 13 Oct 2014 15:52:52 -0700 Subject: rtc: bq32000: add trickle charger device tree binding BQ32000 have "trickle chargers". Introduce a device tree binding for specifying the trickle charger configuration for that. Signed-off-by: Pavel Machek Reviewed-by: Jason Cooper Cc: Matti Vaittinen Cc: Rob Herring Cc: Ian Campbell Cc: Alessandro Zummo Cc: Guenter Roeck Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/devicetree/bindings/i2c/ti,bq32k.txt | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) create mode 100644 Documentation/devicetree/bindings/i2c/ti,bq32k.txt (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/i2c/ti,bq32k.txt b/Documentation/devicetree/bindings/i2c/ti,bq32k.txt new file mode 100644 index 000000000000..e204906b9ad3 --- /dev/null +++ b/Documentation/devicetree/bindings/i2c/ti,bq32k.txt @@ -0,0 +1,18 @@ +* TI BQ32000 I2C Serial Real-Time Clock + +Required properties: +- compatible: Should contain "ti,bq32000". +- reg: I2C address for chip + +Optional properties: +- trickle-resistor-ohms : Selected resistor for trickle charger + Values usable are 1120 and 20180 + Should be given if trickle charger should be enabled +- trickle-diode-disable : Do not use internal trickle charger diode + Should be given if internal trickle charger diode should be disabled +Example: + bq32000: rtc@68 { + compatible = "ti,bq32000"; + trickle-resistor-ohms = <1120>; + reg = <0x68>; + }; -- cgit v1.2.3 From b03023ecbdb76c1dec86b41ed80b123c22783220 Mon Sep 17 00:00:00 2001 From: Oleg Nesterov Date: Mon, 13 Oct 2014 15:53:35 -0700 Subject: coredump: add %i/%I in core_pattern to report the tid of the crashed thread format_corename() can only pass the leader's pid to the core handler, but there is no simple way to figure out which thread originated the coredump. As Jan explains, this also means that there is no simple way to create the backtrace of the crashed process: As programs are mostly compiled with implicit gcc -fomit-frame-pointer one needs program's .eh_frame section (equivalently PT_GNU_EH_FRAME segment) or .debug_frame section. .debug_frame usually is present only in separate debug info files usually not even installed on the system. While .eh_frame is a part of the executable/library (and it is even always mapped for C++ exceptions unwinding) it no longer has to be present anywhere on the disk as the program could be upgraded in the meantime and the running instance has its executable file already unlinked from disk. One possibility is to echo 0x3f >/proc/*/coredump_filter and dump all the file-backed memory including the executable's .eh_frame section. But that can create huge core files, for example even due to mmapped data files. Other possibility would be to read .eh_frame from /proc/PID/mem at the core_pattern handler time of the core dump. For the backtrace one needs to read the register state first which can be done from core_pattern handler: ptrace(PTRACE_SEIZE, tid, 0, PTRACE_O_TRACEEXIT) close(0); // close pipe fd to resume the sleeping dumper waitpid(); // should report EXIT PTRACE_GETREGS or other requests The remaining problem is how to get the 'tid' value of the crashed thread. It could be read from the first NT_PRSTATUS note of the core file but that makes the core_pattern handler complicated. Unfortunately %t is already used so this patch uses %i/%I. Automatic Bug Reporting Tool (https://github.com/abrt/abrt/wiki/overview) is experimenting with this. It is using the elfutils (https://fedorahosted.org/elfutils/) unwinder for generating the backtraces. Apart from not needing matching executables as mentioned above, another advantage is that we can get the backtrace without saving the core (which might be quite large) to disk. [mmilata@redhat.com: final paragraph of changelog] Signed-off-by: Jan Kratochvil Signed-off-by: Oleg Nesterov Cc: Alexander Viro Cc: Denys Vlasenko Cc: Jan Kratochvil Cc: Mark Wielaard Cc: Martin Milata Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/sysctl/kernel.txt | 2 ++ fs/coredump.c | 8 ++++++++ 2 files changed, 10 insertions(+) (limited to 'Documentation') diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index f79eb9666379..57baff5bdb80 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -190,6 +190,8 @@ core_pattern is used to specify a core dumpfile pattern name. %% output one '%' %p pid %P global pid (init PID namespace) + %i tid + %I global tid (init PID namespace) %u uid %g gid %d dump mode, matches PR_SET_DUMPABLE and diff --git a/fs/coredump.c b/fs/coredump.c index a93f7e6ea4cf..b5c86ffd5033 100644 --- a/fs/coredump.c +++ b/fs/coredump.c @@ -199,6 +199,14 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm) err = cn_printf(cn, "%d", task_tgid_nr(current)); break; + case 'i': + err = cn_printf(cn, "%d", + task_pid_vnr(current)); + break; + case 'I': + err = cn_printf(cn, "%d", + task_pid_nr(current)); + break; /* uid */ case 'u': err = cn_printf(cn, "%d", cred->uid); -- cgit v1.2.3 From 71dca95d5cf5ece6c1bee8e625e23c16025952c7 Mon Sep 17 00:00:00 2001 From: Andy Shevchenko Date: Mon, 13 Oct 2014 15:55:18 -0700 Subject: lib/vsprintf: add %*pE[achnops] format specifier This allows user to print a given buffer as an escaped string. The rules are applied according to an optional mix of flags provided by additional format letters. For example, if the given buffer is: 1b 62 20 5c 43 07 22 90 0d 5d The result strings would be: %*pE "\eb \C\a"\220\r]" %*pEhp "\x1bb \C\x07"\x90\x0d]" %*pEa "\e\142\040\\\103\a\042\220\r\135" Please, read Documentation/printk-formats.txt and lib/string_helpers.c kernel documentation to get further information. [akpm@linux-foundation.org: tidy up comment layout, per Joe] Signed-off-by: Andy Shevchenko Suggested-by: Joe Perches Cc: "John W . Linville" Cc: Johannes Berg Cc: Greg Kroah-Hartman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/printk-formats.txt | 32 ++++++++++++++++++ lib/vsprintf.c | 71 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 103 insertions(+) (limited to 'Documentation') diff --git a/Documentation/printk-formats.txt b/Documentation/printk-formats.txt index b4498218c474..5a615c14f75d 100644 --- a/Documentation/printk-formats.txt +++ b/Documentation/printk-formats.txt @@ -70,6 +70,38 @@ DMA addresses types dma_addr_t: For printing a dma_addr_t type which can vary based on build options, regardless of the width of the CPU data path. Passed by reference. +Raw buffer as an escaped string: + + %*pE[achnops] + + For printing raw buffer as an escaped string. For the following buffer + + 1b 62 20 5c 43 07 22 90 0d 5d + + few examples show how the conversion would be done (the result string + without surrounding quotes): + + %*pE "\eb \C\a"\220\r]" + %*pEhp "\x1bb \C\x07"\x90\x0d]" + %*pEa "\e\142\040\\\103\a\042\220\r\135" + + The conversion rules are applied according to an optional combination + of flags (see string_escape_mem() kernel documentation for the + details): + a - ESCAPE_ANY + c - ESCAPE_SPECIAL + h - ESCAPE_HEX + n - ESCAPE_NULL + o - ESCAPE_OCTAL + p - ESCAPE_NP + s - ESCAPE_SPACE + By default ESCAPE_ANY_NP is used. + + ESCAPE_ANY_NP is the sane choice for many cases, in particularly for + printing SSIDs. + + If field width is omitted the 1 byte only will be escaped. + Raw buffer as a hex string: %*ph 00 01 02 ... 3f %*phC 00:01:02: ... :3f diff --git a/lib/vsprintf.c b/lib/vsprintf.c index ba3cd0a35640..ec337f64f52d 100644 --- a/lib/vsprintf.c +++ b/lib/vsprintf.c @@ -33,6 +33,7 @@ #include /* for PAGE_SIZE */ #include /* for dereference_function_descriptor() */ +#include #include "kstrtox.h" /** @@ -1100,6 +1101,62 @@ char *ip4_addr_string_sa(char *buf, char *end, const struct sockaddr_in *sa, return string(buf, end, ip4_addr, spec); } +static noinline_for_stack +char *escaped_string(char *buf, char *end, u8 *addr, struct printf_spec spec, + const char *fmt) +{ + bool found = true; + int count = 1; + unsigned int flags = 0; + int len; + + if (spec.field_width == 0) + return buf; /* nothing to print */ + + if (ZERO_OR_NULL_PTR(addr)) + return string(buf, end, NULL, spec); /* NULL pointer */ + + + do { + switch (fmt[count++]) { + case 'a': + flags |= ESCAPE_ANY; + break; + case 'c': + flags |= ESCAPE_SPECIAL; + break; + case 'h': + flags |= ESCAPE_HEX; + break; + case 'n': + flags |= ESCAPE_NULL; + break; + case 'o': + flags |= ESCAPE_OCTAL; + break; + case 'p': + flags |= ESCAPE_NP; + break; + case 's': + flags |= ESCAPE_SPACE; + break; + default: + found = false; + break; + } + } while (found); + + if (!flags) + flags = ESCAPE_ANY_NP; + + len = spec.field_width < 0 ? 1 : spec.field_width; + + /* Ignore the error. We print as many characters as we can */ + string_escape_mem(addr, len, &buf, end - buf, flags, NULL); + + return buf; +} + static noinline_for_stack char *uuid_string(char *buf, char *end, const u8 *addr, struct printf_spec spec, const char *fmt) @@ -1221,6 +1278,17 @@ int kptr_restrict __read_mostly; * - '[Ii][4S][hnbl]' IPv4 addresses in host, network, big or little endian order * - 'I[6S]c' for IPv6 addresses printed as specified by * http://tools.ietf.org/html/rfc5952 + * - 'E[achnops]' For an escaped buffer, where rules are defined by combination + * of the following flags (see string_escape_mem() for the + * details): + * a - ESCAPE_ANY + * c - ESCAPE_SPECIAL + * h - ESCAPE_HEX + * n - ESCAPE_NULL + * o - ESCAPE_OCTAL + * p - ESCAPE_NP + * s - ESCAPE_SPACE + * By default ESCAPE_ANY_NP is used. * - 'U' For a 16 byte UUID/GUID, it prints the UUID/GUID in the form * "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" * Options for %pU are: @@ -1321,6 +1389,8 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr, }} } break; + case 'E': + return escaped_string(buf, end, ptr, spec, fmt); case 'U': return uuid_string(buf, end, ptr, spec, fmt); case 'V': @@ -1633,6 +1703,7 @@ qualifier: * %piS depending on sa_family of 'struct sockaddr *' print IPv4/IPv6 address * %pU[bBlL] print a UUID/GUID in big or little endian using lower or upper * case. + * %*pE[achnops] print an escaped buffer * %*ph[CDN] a variable-length hex string with a separator (supports up to 64 * bytes of the input) * %n is ignored -- cgit v1.2.3